Creating Evolving User Behavior Profiles Automatically
ABSTRACT:
Knowledge about computer users is very beneficial for assisting them,
predicting their future actions, or detecting masqueraders. In this paper, a
new approach for automatically creating and recognizing the behavior profile
of a computer user is presented. In this case, a computer user's behavior is
represented as the sequence of commands she/he types during her/his work. This
sequence is transformed into a distribution of relevant subsequences of
commands in order to find a profile that defines the user's behavior. Also,
because a user profile is not necessarily fixed but rather evolves/changes, we
propose an evolving method to keep the created profiles up to date using an
Evolving Systems approach. In this paper, we combine the evolving classifier
with trie-based user profiling to obtain a powerful self-learning online
scheme. We also further develop the recursive formula of the potential of a
data point to become a cluster center using cosine distance, which is provided
in the Appendix. The novel approach proposed in this paper is applicable to
any problem of dynamic/evolving user behavior modeling where the behavior can
be represented as a sequence of actions or events. It has been evaluated on
several real data streams.
EXISTING SYSTEM:
Most existing techniques for user
recognition assume the availability of handcrafted user profiles, which encode
the a-priori known behavioral repertoire of the observed user. However, the
construction of effective user profiles is a difficult problem for different reasons:
human behavior is often erratic, and sometimes humans behave differently
because of a change in their goals. This last problem makes it necessary for
the user profiles we create to evolve.
DISADVANTAGES OF EXISTING SYSTEM:
In recent years, significant work has
been carried out for profiling users, but most of the user profiles do not
change according to the environment and new goals of the user.
PROPOSED SYSTEM:
In this paper, we propose an adaptive
approach for creating behavior profiles and recognizing computer users. We call
this approach Evolving Agent behavior Classification based on Distributions of
relevant events (EVABCD) and it is based on representing the observed behavior
of an agent (computer user) as an adaptive distribution of her/his relevant
atomic behaviors (events). Once the model has been created, EVABCD presents an
evolving method for updating and evolving the user profiles and classifying an
observed user. The approach we present is generalizable to all kinds of user
behaviors represented by a sequence of events.
ADVANTAGES OF PROPOSED SYSTEM:
1. It can cope with huge amounts of data.
2. Its evolving structure can capture sudden and abrupt changes in the stream of data.
3. The meaning of its structure is very clear, as we propose a rule-based classifier.
4. It is non-iterative and single pass; therefore, it is computationally very efficient and fast.
5. Its classifier structure is simple and interpretable.
MODULES:
1. Segmentation of the sequence of commands.
2. Storage of the subsequences in a trie.
3. Creation of the user profile.
Segmentation of the Sequence of Commands
First, the sequence is segmented into
subsequences of equal length from the first to the last element. Thus, the
sequence A = A_1 A_2 ... A_n (where n is the number of commands of the sequence)
will be segmented into the subsequences described by A_i ... A_{i+length} ∀i, i =
[1, n − length + 1], where length is the size of the subsequences created. In
the remainder of the paper, we will use the term subsequence length to denote the
value of this length. This value determines how many commands are considered as
dependent.
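The segmentation step can be sketched as a sliding window over the command list. This is a minimal illustration, not the authors' implementation: the function name `segment` and the sample `session` are hypothetical, and the sketch assumes each window holds exactly `length` commands, which is what the index range 1 to n − length + 1 implies.

```python
from typing import List, Sequence


def segment(commands: Sequence[str], length: int) -> List[List[str]]:
    """Return every window of `length` consecutive commands, in order.

    Hypothetical helper: assumes a window contains exactly `length` items.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # There are n - length + 1 valid start positions; a sequence shorter
    # than `length` produces an empty range and therefore no windows.
    return [list(commands[i:i + length])
            for i in range(len(commands) - length + 1)]


# Illustrative session (made-up data) with subsequence length 3.
session = ["ls", "date", "ls", "date", "cat"]
for window in segment(session, 3):
    print(" -> ".join(window))
```

A larger `length` treats more consecutive commands as dependent, at the cost of fewer and sparser subsequences per session.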
Storage of the Subsequences in a Trie
The subsequences of commands are stored
in a trie data structure. When a new model needs to be constructed, we create
an empty trie, and insert each subsequence of events into it, such that all
possible subsequences are accessible and explicitly represented. Every trie
node represents an event appearing at the end of a subsequence, and the node's children
represent the events that have appeared following this event. Also, each node
keeps track of the number of times a command has been recorded into it. When a
new subsequence is inserted into a trie, the existing nodes are modified and/or
new nodes are created. As the dependencies of the commands are relevant in the
user profile, the subsequence suffixes (subsequences that extend to the end of
the given sequence) are also inserted.
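The trie storage described above might look like the following sketch. The names `TrieNode`, `insert`, `insert_with_suffixes`, and `build_trie` are hypothetical, and the sketch assumes that every node along an inserted path has its counter incremented and that windows hold exactly `length` commands.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class TrieNode:
    """One event in the trie, with the number of times it was recorded."""
    count: int = 0
    children: Dict[str, "TrieNode"] = field(default_factory=dict)


def insert(root: TrieNode, subsequence: Sequence[str]) -> None:
    """Walk down from the root, creating missing nodes and bumping counts."""
    node = root
    for command in subsequence:
        node = node.children.setdefault(command, TrieNode())
        node.count += 1


def insert_with_suffixes(root: TrieNode, subsequence: Sequence[str]) -> None:
    """Insert the subsequence and each of its suffixes."""
    for start in range(len(subsequence)):
        insert(root, subsequence[start:])


def build_trie(commands: Sequence[str], length: int) -> TrieNode:
    """Segment `commands` into windows of `length` and store them all."""
    root = TrieNode()
    for i in range(len(commands) - length + 1):
        insert_with_suffixes(root, commands[i:i + length])
    return root


# Illustrative session (made-up data).
root = build_trie(["ls", "date", "ls", "date", "cat"], 3)
print(root.children["ls"].count)                   # 4
print(root.children["ls"].children["date"].count)  # 3
```

Because suffixes are inserted as well, short subsequences such as a single `ls` are counted at the first level of the trie even when they occur in the middle or at the end of a window.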
Creation of the User Profile
Once the trie is created, the
subsequences that characterize the user profile and their relevance are
calculated by traversing the trie. For this purpose, frequency-based methods
are used. In particular, in EVABCD, to evaluate the relevance of a subsequence,
its relative frequency or support is calculated. In this case, the support of a
subsequence is defined as the ratio of the number of times the subsequence has
been inserted into the trie and the total number of subsequences of equal size
inserted.
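The support calculation can be illustrated with a short sketch. For brevity it uses a flat `Counter` keyed by command tuples as a stand-in for the trie's node counts, which is an assumption of this example and not the structure described in the text; the helper names `subsequence_counts` and `supports` are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def subsequence_counts(commands: Sequence[str], length: int) -> Counter:
    """Count each window, its suffixes, and every prefix of those.

    This mirrors the counters of a trie in which each node on an inserted
    path is incremented (an assumption of this sketch).
    """
    counts: Counter = Counter()
    for i in range(len(commands) - length + 1):
        window = tuple(commands[i:i + length])
        for start in range(length):
            for end in range(start + 1, length + 1):
                counts[window[start:end]] += 1
    return counts


def supports(counts: Counter) -> Dict[Tuple[str, ...], float]:
    """Support = count of a subsequence / total count of same-size ones."""
    totals: Counter = Counter()
    for subseq, count in counts.items():
        totals[len(subseq)] += count
    return {subseq: count / totals[len(subseq)]
            for subseq, count in counts.items()}


# Illustrative session (made-up data).
profile = supports(subsequence_counts(["ls", "date", "ls", "date", "cat"], 3))
print(profile[("ls", "date")])  # 0.5
```

Under these assumptions, the supports of all subsequences of one size sum to 1, so the profile is a distribution per subsequence size.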
Chart Generation for User/Group
Two previously reported data sets are used to examine our proposed model
for learning evolving user behavior. The first data set is acquired
from user interest, the second from user behavior; we study whether or
not a user visits a group of interest. A chart is then generated based on the
groups the user visited during the month.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 MB.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• RAM : 512 MB.
SOFTWARE REQUIREMENTS:
• Operating System : Windows XP.
• Coding Language : ASP.NET, C#.Net.
• Database : SQL Server 2005.
REFERENCE:
Jose Antonio Iglesias, Member, IEEE Computer Society, Plamen Angelov, Senior
Member, IEEE Computer Society, Agapito Ledezma, Member, IEEE Computer Society,
and Araceli Sanchis, Member, IEEE Computer Society, “Creating Evolving User
Behavior Profiles Automatically”, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 24, NO. 5, MAY 2012.