The Knowledge Engineering Group (KEG) addresses the problems of knowledge extraction, representation, storage and management that data overload in the information era has brought to various areas of human activity.
- Fundamental theoretical aspects: problem-specific feature extraction from both structured and unstructured data, pre-processing techniques for handling noisy and/or incomplete data, and learning from balanced/unbalanced and structured/unstructured data.
- Data Mining application prototypes for both structured data (assisted medical diagnosis, spam detection, signature recognition) and unstructured data (topic extraction, opinion mining, community detection, semi-supervised text labelling, contradiction detection).
- Business Intelligence application prototypes dealing with heterogeneous data integration by ontology-driven, (semi-) automatic design of unified data structures and automatic design of the corresponding ETL processes.
- handling incomplete records and irrelevant and/or redundant pieces of information, imbalanced class distribution and error costs
- identifying the right performance metric given the context, algorithm and model selection
- schema mapping and data fusion
- context-sensitive information retrieval (IR) from unstructured sources
- community detection and opinion mining
- Recommendation systems - context-sensitive, semantics-driven recommendation systems for online advertising and tourism
- Topic extraction and representation - identifying the topic polarity in a given document; projecting (very) large (un)structured data onto relevant dimensions and providing representations that allow knowledge extraction
- Community detection - identifying clusters from implicit and/or explicit connections; community detection in social data; opinion-driven community detection
- User profiling - finding groups of individuals with similar features, finding/defining patterns for various profiles, and predicting trends and future behaviour, applied to the educational domain
- Contradiction Detection - contradiction detection driven by opinion mining
- Medical decision support systems - assisting medical diagnosis in prostate cancer and rheumatoid diseases
PROJECTS
SEArCH - Adaptive eLearning Systems using Concept Maps
National grant funded by CNMP Program 4: Research partnership for priority domains, (2008-2011)
Homepage: http://search.utcluj.ro/
The goal of the project is to define a model of an adaptive e-learning environment, using Concept Maps.
Adaptive e-learning systems are the newest paradigm in modern learning approaches. Adaptive
presentation refers to content segmentation and management according to the student's particularities and
goals, and is based on identifying the user's type. One of the key factors in such systems is the correct
and continuous identification of the user's learning style, so that the most appropriate content presentation
can be provided to each individual user. This is achieved through an initial evaluation of the user, which
identifies the learning style and level of expertise. Based on those measurements, the content is presented
according to the user's type, providing an initial curriculum segmentation and an adapted presentation. During the
learning process, the user evaluation is continuously refined through dynamic, ongoing measurements, in
an attempt to best fit the user's particular needs. Thus, the ultimate goal of the model is to correctly identify the
user's type and to continuously adapt the content (both in quantity and difficulty) to that type. So far, we
have investigated various ways of identifying the initial user typology based on static features. We
proposed two solutions: using a Bayesian network, and employing a clustering method to determine
the different groups of learning typologies, corresponding to the theoretical learning styles described in the
literature, based on the (psycho-pedagogical) pretest.
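The clustering solution can be illustrated with a minimal sketch. It is not the project's implementation: the choice of plain k-means, the two pretest dimensions (visual and verbal scores), the sample values and the names `kmeans` and `pretest` are all assumptions made for illustration only.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means over equal-length numeric vectors.

    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Start from k distinct learners chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each learner to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centroids, labels


# Hypothetical pretest results: (visual score, verbal score) per learner.
pretest = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.15),  # mostly visual learners
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.95),  # mostly verbal learners
]

centroids, labels = kmeans(pretest, k=2)
for learner, label in zip(pretest, labels):
    print(learner, "-> typology", label)
```

Each resulting cluster would then be matched against one of the theoretical learning styles. In practice the number of clusters and the pretest features would come from the psycho-pedagogical model rather than from the fixed values assumed here.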
ArhiNet - Integrated System for developing semantically-enhanced archive content
National grant funded by CNMP Program 4: Research partnership for priority domains, (2007-2010)
Homepage: http://coned.utcluj.ro/ARHINET/arch.html
This inter-disciplinary project addresses the study, development and management of interactive e-content for digital
enhancement of cultural heritage. The project aims at the study and development of an integrated system for creating
and managing archival content based on semantic enhancements. Content enhanced with the domain ontology allows for
semantically relevant information retrieval. The project also aims at the development of an information mining
subsystem and reasoning mechanisms to identify new correlations that will be added to the domain knowledge.
IntelPro - Intelligent system for assisting the therapeutically decision at patients with prostate cancer
National research grant funded by ANCS, CEEX - INFOSOC, (2005-2008)
Homepage: http://cv.utcluj.ro/intelpro/
The goal of our task in the project was to provide robust solutions that can be used to assist physicians
in the diagnosis of prostate cancer, or as support in the learning process. The data-mining system speeds up the
diagnosis process and improves the accuracy of the diagnosis. The system could be extended to suggest possible
treatments or courses of action in a particular case. It is not intended to replace the physician, but to support them.
The developed components tackle some of the particularities involved in mining medical data.
Although the techniques we have adopted so far are aimed at "solving" prostate cancer problems, they are not restricted to
this field. The methods can be extended to other medical problems, or applied even further afield, in
areas such as loan applications, oil-slick detection, and so on.
GridMOSI - Virtual Organization using Grid Technology for High Performance Modeling, Simulation and Optimization
National research grant funded by ANCS, CEEX, (2005-2008)
Homepage: http://wiki.gridmosi.ro/wiki/GridMOSI:Info