Deriving Concept-Based User Profiles From Search Engine Logs
Abstract
User profiling is a fundamental component of any personalization application.
Most existing user profiling strategies are based on objects that users are
interested in (i.e., positive preferences), but not on the objects that users
dislike (i.e., negative preferences). In this paper, we focus on search engine
personalization and develop several concept-based user profiling methods that
are based on both positive and negative preferences. We evaluate the proposed
methods against our previously proposed personalized query clustering method.
Experimental results show that profiles which capture and utilize both the
user’s positive and negative preferences perform the best. An important result
from the experiments is that profiles with negative preferences can increase
the separation between similar and dissimilar queries. The separation provides
a clear threshold for an agglomerative clustering algorithm to terminate and
improves the overall quality of the resulting query clusters.
Existing System:
Existing clickthrough-based user profiling strategies can be categorized into
document-based and concept-based approaches. Both assume that user clicks can
be used to infer users’ interests, although their inference methods and the
outcomes of the inference are different. Document-based profiling methods try
to estimate users’ document preferences (i.e., users are interested in some
documents more than others). On the other hand, concept-based profiling methods
aim to derive topics or concepts that users are highly interested in. These two
approaches will be reviewed in. While there are document-based methods that
consider both users’ positive and negative preferences, to the best of our
knowledge, there are no concept-based methods that consider both positive and
negative preferences in deriving users’ topical interests.
Problems:
- Personalized search is not implemented to display relevant (user-specific) results.
- Most existing user profiling strategies only consider documents that users are interested in (i.e., users’ positive preferences) but ignore documents that users dislike (i.e., users’ negative preferences).
- Short and ambiguous queries are unable to express the user’s precise needs, and the same results are displayed for the same query regardless of the user’s real interest.
Proposed System:
Personalized search is an
important research area that aims to resolve the ambiguity of query terms. To
increase the relevance of search results, personalized search engines create
user profiles to capture the users’ personal preferences and as such identify
the actual goal of the input query. Since users are usually reluctant to explicitly
provide their preferences due to the extra manual effort involved, recent
research has focused on the automatic learning of user preferences from users’
search histories or browsed documents and the development of personalized
systems based on the learned user preferences. A good user profiling strategy
is an essential and fundamental component in search engine personalization. We
studied various user profiling strategies for search engine personalization,
and observed the following problems in existing strategies. Most
personalization methods focused on the creation of one single profile for a
user and applied the same profile to all of the user’s queries. We believe that
different queries from a user should be handled differently because a user’s
preferences may vary across queries. For example, a user who prefers
information about fruit on the query “orange” may prefer the information about
Apple Computer for the query “apple.” Personalization strategies such as
employed a single large user profile for each user in the personalization
process.
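The idea of keeping one profile per query, rather than one large profile per
user, could be sketched as follows. This is only an illustration: the class
`QueryOrientedProfiles`, its methods, the user name, and all concept names and
weights are hypothetical and are not taken from the system described here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one concept-weight profile per (user, query) pair
// instead of a single profile per user. Names and weights are illustrative.
final class QueryOrientedProfiles {
    // user -> query -> concept -> weight (weights may be negative)
    private final Map<String, Map<String, Map<String, Double>>> profiles =
            new HashMap<String, Map<String, Map<String, Double>>>();

    void setWeight(String user, String query, String concept, double weight) {
        Map<String, Map<String, Double>> byQuery = profiles.get(user);
        if (byQuery == null) {
            byQuery = new HashMap<String, Map<String, Double>>();
            profiles.put(user, byQuery);
        }
        Map<String, Double> byConcept = byQuery.get(query);
        if (byConcept == null) {
            byConcept = new HashMap<String, Double>();
            byQuery.put(query, byConcept);
        }
        byConcept.put(concept, weight);
    }

    // Returns the highest-weighted concept for this user and query,
    // or null if no profile exists for the pair.
    String topConcept(String user, String query) {
        Map<String, Map<String, Double>> byQuery = profiles.get(user);
        if (byQuery == null || !byQuery.containsKey(query)) {
            return null;
        }
        String best = null;
        double bestWeight = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : byQuery.get(query).entrySet()) {
            if (e.getValue() > bestWeight) {
                bestWeight = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        QueryOrientedProfiles p = new QueryOrientedProfiles();
        p.setWeight("alice", "orange", "fruit", 0.8);
        p.setWeight("alice", "orange", "telecom company", -0.3);
        p.setWeight("alice", "apple", "computer", 0.9);
        p.setWeight("alice", "apple", "fruit", -0.2);
        System.out.println(p.topConcept("alice", "orange"));
        System.out.println(p.topConcept("alice", "apple"));
    }
}
```

With a single profile per user, the concept "fruit" would carry one weight for
both queries; keying the profile by query lets the same user favor "fruit" for
“orange” and "computer" for “apple.”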
Advantages:
- We extend the query-oriented, concept-based user profiling method proposed in to consider both users’ positive and negative preferences in building users’ profiles.
- We propose six user profiling methods that exploit a user’s positive and negative preferences to produce profiles (SVM-based decision making).
- We use RSVM to learn, from concept preferences, weighted concept vectors representing concept-based user profiles.
- We show that profiles which capture both the user’s positive and negative preferences perform best among all of the proposed methods. We also find that the query clusters obtained from our methods are very close to the optimal clusters.
Architecture
HARDWARE & SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
· System : Pentium IV 2.4 GHz.
· Hard Disk : 40 GB.
· Floppy Drive : 1.44 MB.
· Monitor : 15 VGA Color.
· Mouse : Logitech.
· RAM : 512 MB.
SOFTWARE REQUIREMENTS:
· Operating system : Windows XP Professional.
· Coding Language : Java (JDK 1.6.0)
· Front End : Struts Framework
· Back End : Oracle 10g
Modules Description
1. Concept based user selection
2. User log information
3. Support identification based on the concept
4. Weight generation
5. Precision and Recall
Concept Based User Selection:
This module uses concept-based user profiling strategies that are capable of
deriving both the users’ positive and negative preferences. All of the user
profiling strategies are query-oriented, meaning that a profile is created for
each of the user’s queries. The user profiling strategies are evaluated and
compared with our previously proposed personalized query clustering method.
User Log Information:
User profiling strategies can be broadly classified into two main approaches:
document-based and concept-based approaches. Document-based user profiling
methods aim at capturing users’ clicking and browsing behaviors. Users’
document preferences are first extracted from the clickthrough data and then
used to learn the user behavior model, which is usually represented as a set of
weighted features. On the other hand, concept-based user profiling methods aim
at capturing users’ conceptual needs. Users’ browsed documents and search
histories are automatically mapped into a set of topical categories. User
profiles are created based on the users’ preferences on the extracted topical
categories.
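The text above does not say how document preferences are extracted from the
clickthrough data. As a minimal sketch, assume one commonly used heuristic
(often attributed to Joachims): a clicked document is taken to be preferred
over every unclicked document ranked above it. The class `ClickPreferences`,
the method `extract`, and the document identifiers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of extracting pairwise document preferences from one
// result list and the set of documents the user clicked. The "clicked beats
// skipped-above" rule is an assumption, not something this text specifies.
final class ClickPreferences {
    // Returns pairs {preferred, lessPreferred}.
    static List<String[]> extract(List<String> rankedResults, Set<String> clicked) {
        List<String[]> pairs = new ArrayList<String[]>();
        for (int i = 0; i < rankedResults.size(); i++) {
            String doc = rankedResults.get(i);
            if (!clicked.contains(doc)) {
                continue; // only clicked documents produce preferences
            }
            for (int j = 0; j < i; j++) {
                String above = rankedResults.get(j);
                if (!clicked.contains(above)) {
                    pairs.add(new String[] {doc, above});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d1", "d2", "d3", "d4");
        Set<String> clicked = new HashSet<String>(Arrays.asList("d1", "d3"));
        for (String[] pair : extract(ranked, clicked)) {
            System.out.println(pair[0] + " preferred over " + pair[1]);
        }
    }
}
```

In this example only d3 yields a preference (over the skipped d2); the click on
d1 yields none because nothing is ranked above it. Pairs of this kind are the
usual training input for a ranking learner such as RSVM.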
Support Identification Based on the Concept:
This module learns, from concept preferences, weighted concept vectors that
represent concept-based user profiles. The weights of the vector elements,
which can be positive or negative, represent the interestingness (or
uninterestingness) of the concepts to the user. In, the weights that represent
a user’s interests are all positive, meaning that the method can only capture
the user’s positive preferences.
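A weighted concept vector with signed weights could be represented as in the
sketch below. The class `ConceptVector`, the use of cosine similarity to
compare two profiles, and the example concepts and weights are assumptions made
for illustration; the learning step itself (RSVM) is not shown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a concept-based profile: a sparse vector whose
// weights may be positive (interesting) or negative (uninteresting).
final class ConceptVector {
    private final Map<String, Double> weights = new HashMap<String, Double>();

    ConceptVector set(String concept, double weight) {
        weights.put(concept, weight);
        return this;
    }

    // Concepts that are absent from the profile count as weight 0.
    double weight(String concept) {
        Double w = weights.get(concept);
        return w == null ? 0.0 : w;
    }

    double norm() {
        double sum = 0.0;
        for (double w : weights.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity in [-1, 1]; returns 0 if either vector is all zeros.
    static double cosine(ConceptVector a, ConceptVector b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.weights.entrySet()) {
            dot += e.getValue() * b.weight(e.getKey());
        }
        double denom = a.norm() * b.norm();
        return denom == 0.0 ? 0.0 : dot / denom;
    }

    public static void main(String[] args) {
        // Positive-only profiles for two unrelated queries.
        ConceptVector p1 = new ConceptVector().set("fruit", 0.8);
        ConceptVector p2 = new ConceptVector().set("computer", 0.9);
        // The same profiles with negative preferences added.
        ConceptVector n1 = new ConceptVector().set("fruit", 0.8).set("computer", -0.5);
        ConceptVector n2 = new ConceptVector().set("computer", 0.9).set("fruit", -0.4);
        System.out.println("positive only: " + cosine(p1, p2));
        System.out.println("with negatives: " + cosine(n1, n2));
    }
}
```

In this toy example the positive-only profiles share no concepts, so their
similarity is zero, while the profiles with negative weights have a similarity
below zero. This is one way to see how negative preferences can widen the gap
between dissimilar queries.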
Weight Generation:
This module evaluates the proposed user profiling strategies and compares them
with a baseline proposed in. We show that profiles which capture both the
user’s positive and negative preferences perform best among all of the proposed
methods. We also find that the query clusters obtained from our methods are
very close to the optimal clusters.
Precision and Recall:
We define the optimal clusters to be the clusters obtained by the best
termination strategies for initial clustering and community merging. The
optimal clusters are compared to the standard clusters using the standard
precision and recall measures:

precision(q) = |Q relevant ∩ Q retrieved| / |Q retrieved|
recall(q) = |Q relevant ∩ Q retrieved| / |Q relevant|

where q is the input query, Q relevant is the set of queries that exist in the
predefined cluster for q, and Q retrieved is the set of queries generated by
the clustering algorithm. The precision and recall from all queries are
averaged to plot the precision-recall figures, comparing the effectiveness of
the user profiles.
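The two measures could be computed for a single query as in the sketch below,
using the usual set-based definitions. The class `ClusterQuality` and the
example query sets are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of per-query precision and recall over query sets.
final class ClusterQuality {
    // Fraction of retrieved queries that are also in the predefined cluster.
    static double precision(Set<String> relevant, Set<String> retrieved) {
        if (retrieved.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(relevant, retrieved) / retrieved.size();
    }

    // Fraction of the predefined cluster that the algorithm retrieved.
    static double recall(Set<String> relevant, Set<String> retrieved) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(relevant, retrieved) / relevant.size();
    }

    private static int overlap(Set<String> relevant, Set<String> retrieved) {
        Set<String> common = new HashSet<String>(relevant);
        common.retainAll(retrieved); // set intersection
        return common.size();
    }

    public static void main(String[] args) {
        Set<String> relevant = new HashSet<String>(
                Arrays.asList("apple", "apple mac", "apple ipod", "macbook"));
        Set<String> retrieved = new HashSet<String>(
                Arrays.asList("apple", "apple mac", "apple pie"));
        System.out.println("precision = " + precision(relevant, retrieved));
        System.out.println("recall = " + recall(relevant, retrieved));
    }
}
```

Here two of the three retrieved queries are relevant (precision 2/3) and two of
the four relevant queries are retrieved (recall 1/2). Averaging these values
over all input queries gives one point on a precision-recall figure.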
Algorithms: Personalized Agglomerative Clustering
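The role of the similarity threshold in terminating agglomerative clustering
can be illustrated with the generic sketch below. It is not the personalized
algorithm itself: the class `ThresholdAgglomerative`, the use of average-link
similarity, and the example similarity matrix are all assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: generic agglomerative clustering that stops merging
// once the best pair of clusters is less similar than a threshold.
final class ThresholdAgglomerative {
    // sim[i][j] is the similarity between items i and j (symmetric).
    static List<List<Integer>> cluster(double[][] sim, double threshold) {
        List<List<Integer>> clusters = new ArrayList<List<Integer>>();
        for (int i = 0; i < sim.length; i++) {
            List<Integer> single = new ArrayList<Integer>();
            single.add(i);
            clusters.add(single);
        }
        while (clusters.size() > 1) {
            int bestA = -1;
            int bestB = -1;
            double best = Double.NEGATIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++) {
                for (int b = a + 1; b < clusters.size(); b++) {
                    double s = averageLink(sim, clusters.get(a), clusters.get(b));
                    if (s > best) {
                        best = s;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            if (best < threshold) {
                break; // remaining clusters are too dissimilar to merge
            }
            // bestA < bestB, so removing bestB does not shift bestA.
            clusters.get(bestA).addAll(clusters.remove(bestB));
        }
        return clusters;
    }

    // Mean pairwise similarity between the members of two clusters.
    private static double averageLink(double[][] sim, List<Integer> x, List<Integer> y) {
        double total = 0.0;
        for (int i : x) {
            for (int j : y) {
                total += sim[i][j];
            }
        }
        return total / (x.size() * y.size());
    }

    public static void main(String[] args) {
        // Four queries: 0 and 1 are similar, 2 and 3 are similar.
        double[][] sim = {
            {1.0, 0.9, 0.1, 0.1},
            {0.9, 1.0, 0.1, 0.1},
            {0.1, 0.1, 1.0, 0.8},
            {0.1, 0.1, 0.8, 1.0}
        };
        System.out.println(cluster(sim, 0.5));
    }
}
```

When similar and dissimilar queries are well separated, as in this example
matrix, any threshold between the two groups of values produces the same
clusters. This is why a wider separation makes the termination point easier to
choose.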