Usage Analytics: Research Directions to Discover Insights from Cloud-based Applications
Abstract: Usage in the software field deals with knowledge about how end-users use the application and how the application reacts to the users’ actions. In a complex and heterogeneous cloud computing environment, extracting and analysing usage data is difficult because the data is spread across the various front-end interfaces and back-end infrastructural components of the cloud that host the application. In this paper we propose Usage Analytics, a set of potential research directions that could help tackle various challenges in the cloud domain. We provide an overview of usage analytics in the cloud environment and propose how to discover insights using these analytics solutions. We discuss the challenges in discovering insights from usage data and provide a vision of how usage data will bring benefits to the cloud environment.
Dang-Nguyen, D., Kesavulu, M. and Helfert, M.
Usage Analytics: Research Directions to Discover Insights from Cloud-based Applications.
DOI: 10.5220/0006764002540261
In Proceedings of the 7th International Conference on Smart Cities and Green ICT Systems (SMARTGREENS 2018), pages 254-261
ISBN: 978-989-758-292-9
Copyright © 2019 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
data generated by the users. User interests can be modelled by extracting browsing behaviour when accessing web applications (Gasparetti, 2016). Such analytical solutions are increasingly considered critical tools for modern enterprises to gain an informational advantage, and have evolved from a matter of choice to a fundamental requirement in today’s competitive business environment. Applying these solutions is thus key to discovering insights from the applications’ usage.

An intuitive solution is to ask customers how they use these applications through well-designed studies (interviews or surveys). Unfortunately, this approach has several limitations, such as the cost of conducting the studies, the inability to include a large population, and the fact that users may not be willing or able to self-identify. These issues can be addressed by applying data analytics to the usage data, namely usage analytics, which aims to obtain insightful and actionable information for data-driven tasks around applications and services. With the improvement of data mining tools, usage data can be gathered from online services by collecting all traces of user activity to produce clickstreams, i.e., sequences of timestamped events generated by user actions. For web-based services, these might include detailed HTTP requests. For mobile applications, clickstreams can include everything from button clicks to finger swipes and text or voice input (Wang et al., 2016). Insightful information is information that conveys meaningful and useful understanding or knowledge towards providing the target service or user satisfaction with that service (Zhang et al., 2011). Typically, insightful information is not easily obtained by directly investigating the raw data without the aid of analytical solutions. Developing a usage analytics project typically goes through iterations of four phases: task definition, data preparation, analytic-technology development, and deployment and feedback gathering. Task definition defines the target task to be assisted by software analytics. Data preparation collects the data to be analysed. Analytic-technology development develops the problem formulation, algorithms, and systems to explore, understand, and discover insights from the data. Deployment and feedback gathering involves two typical scenarios. In the first, we as researchers have obtained some insightful information from the data and ask domain experts to review and verify it. In the second, we ask domain experts to use the analytic tools that we have developed to obtain insights by themselves. Most of the time it is the second scenario that we want to enable.

In this paper, we will show that when applying usage analytics in practice, we should fuse a broad spectrum of domain knowledge and expertise, from management, machine learning, and data processing to information visualization. By using usage analytics, we aim to propose powerful tools to address the following challenges (a deeper discussion can be found in (Kesavulu et al., 2018)):

1. Resource Provisioning: based on the usage data, predict the resources that may be allocated to an application.
2. Problem Diagnosis: analyse the usage data, which are the logs in this case, to localize the node that is the source of performance problems.
3. Understanding User Satisfaction: instead of surveying users and asking for feedback, how satisfied users are with an application can be revealed via their usage data.
4. Discovering User Behaviour Patterns: every user has their own pattern when using an application or a service. Understanding these patterns could help to improve the service or discover trends in advance. These patterns can be discovered from the usage data.

Consequently, the aims of this paper are:

• To provide an overview of what usage analytics is in the cloud environment and propose a usage data classification;
• To inspire and motivate researchers to use their know-how in this newly emerging and important area;
• To propose how future usage data in cloud applications should be extracted, and how insights can be discovered from that data via appropriate analytics techniques;
• To discuss the challenges in discovering insights from the usage data and provide a vision of how usage data will bring benefits to the cloud environment.

The remainder of the paper is structured as follows. In Section 2, we discuss the problems and challenges in understanding what usage data means, in extracting information from usage data, and in building a usage analytics framework, and we discuss the general requirements for such a framework. In Section 3, we describe potential analytics methods that aim to provide solutions. In Section 4, we discuss the potential of this research and the corresponding challenges. Finally, in Section 5, we draw conclusions.
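To make the notion of a clickstream introduced above concrete, the following minimal sketch represents user activity as timestamped events and splits each user's events into sessions whenever the gap between two consecutive events exceeds a threshold. The event fields, the sample data and the 30-minute threshold are assumptions chosen for illustration; they are not taken from the cited work.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class ClickEvent:
    user_id: str
    timestamp: float  # seconds since an arbitrary epoch (assumed unit)
    action: str       # e.g. an HTTP request or a button identifier


def sessionize(events, max_gap=1800.0):
    """Split each user's events into sessions at gaps longer than max_gap."""
    sessions = []
    # Order by user, then by time, so groupby sees one run per user.
    ordered = sorted(events, key=lambda e: (e.user_id, e.timestamp))
    for _, user_events in groupby(ordered, key=lambda e: e.user_id):
        current = []
        for event in user_events:
            if current and event.timestamp - current[-1].timestamp > max_gap:
                sessions.append(current)
                current = []
            current.append(event)
        sessions.append(current)
    return sessions


# Hypothetical sample events for two users.
EVENTS = [
    ClickEvent("alice", 0.0, "GET /login"),
    ClickEvent("alice", 40.0, "GET /dashboard"),
    ClickEvent("bob", 10.0, "GET /login"),
    ClickEvent("alice", 4000.0, "GET /reports"),
]

for session in sessionize(EVENTS):
    print(session[0].user_id, [e.action for e in session])
```

Sessions of this kind are one possible unit of analysis for the later phases (data preparation and analytic-technology development); other segmentations are equally plausible.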
[Figure 1 (labels recovered from the diagram): Application logs. Raw data: assessment data (wiki, forum, message), activity data (click, time spent), content data (posts, discussions), event data. Visualisation with correlations.]
2 PROBLEMS, CHALLENGES AND REQUIREMENTS

In this section, we address a few basic questions on this topic: “what are the usage data?”, “where can we get them?”, “what are the challenges?” and “what are the requirements in analysing these data?”.

2.1 What are the Usage Data?

Usage data, as the name suggests, are data generated when users or customers are using applications and services. They can be extracted at any stage while services and/or applications are being provided. In this study, we propose to group these data into three groups, coming from three main sources: the system logs of the cloud services at the back-end of the cloud system, the application logs, and the logs from the virtual machines (VMs). Fig. 1 shows a summary of the three main sources of usage data and the answers to the above questions.

System logs contain a wealth of information to help manage systems. Most systems print out logs during execution to record runtime actions and states that directly reflect the system’s runtime behaviour. System developers and architects usually use these logs to track a system and to detect and diagnose system anomalies.

The second type is the user-level usage data generated as a result of user interaction with a cloud-based application. Examples are application logs, such as assessment data (wiki, forum, message), activity data (clicks, time spent), server logs, and so on. They can be extracted by the applications themselves or via Web cookies (from the web browser). Such data in the cloud is spread across various interfaces, such as Web browsers, mobile applications and command-line interfaces on the front-end, and servers and databases on the back-end.

The last type of usage data is the VM logs, typically generated by the VMs running the applications or services. These logs contain the usage of CPU and memory, as well as the running tasks, their start and stop times, and other information.

2.2 Challenges in Extracting Information from Usage Data

As mentioned above, usage data can be extracted at any stage, and the major challenge is that they can come in any form and format, which complicates the analysis. The main questions for a usage data extractor are what usage data should be extracted and how to map the raw usage data to the right applications or services. Given the multi-tenant architecture of the cloud, different applications share the same physical and virtual resources. This raises the challenge of how to separate and extract the logs that represent each application from the instance (VM) co-hosting the applications.

Another important challenge is handling different contextual information. A system usually has a lot of branches, and thus the system’s behavior may be quite different under different input data or environmental conditions. Knowing the execution behavior under different inputs or configurations can greatly help system operators understand the system. However, there may be a large number of different combinations of inputs or parameters under different system behaviors. Such complexity makes it difficult to analyze the contextual information related to the state of interest.

2.3 Challenges in Building Usage Analytics Framework

Despite the advantages provided by analytical solutions, implementing them is usually very costly, which hinders enterprises, especially SMBs (Small and Medium Businesses), from starting such projects (Sun et al., 2012). Normally, storing huge volumes of data requires a large storage system as well as buying expensive analytics software (and training staff to use it). This also comes with a large number of clusters and powerful machines to run the data analytics algorithms.

Another cost-related challenge is that only a small number of large companies or enterprises can afford to run a usage analytics framework. They normally have to pay a lot to maintain complex software and hardware only for occasional use, for example when a financial quarter is over or some unusual event happens (Sun et al., 2012).

The framework also has to be constantly updated, since cloud-based environments evolve very quickly. The framework should be able to anticipate and adapt to changes in technology, and to process newly emerging types of usage data, services and applications.

Last but not least, a usage analytics framework requires highly skilled analysts to run the analytical solutions, since they require constant tuning, validating, and updating according to the changing business context as well as the nature of the services and applications.

2.4 Usage Analytics Requirements

Usage analytics requires technologies to efficiently process and discover insights from large and diverse usage data. Summing up several previous studies in software analytics and cloud computing ((Aceto et al., 2012), (Jehangiri et al., 2013)), we list here the general requirements for a usage analytics framework:

1. Scalability: The system should be scalable, i.e., it should be able to handle a large number of usage data extractors. This requirement is very important in cloud environments because of the large number of services and because the structure of cloud systems may grow elastically.
2. Heterogeneous data: The system should consider a heterogeneous group of metrics. It should be able to deal with usage data at different levels: service level (application), virtual IT-infrastructure level (e.g., VM logs), and fine-grained physical IT-infrastructure level (e.g., the cloud system logs).
3. Relationship: There is a hierarchical relationship between applications, VMs and physical machines. These relationships can change dynamically, so the analytics system also has to cope with this aspect.
4. Meaningful: The usage data extracted from a variety of sources must be meaningful, i.e., the system should be able to filter out non-relevant information, e.g., noise data. Furthermore, data extractors should be easy to extend, for example by adding more plugins.
5. Abstraction: The usage analytics needs to exhaustively aggregate runtime data from different sources and consolidate information at a high level of abstraction.
6. Identification of Influential Metrics: The system should be able to identify the metrics or parameters that strongly influence decision making, which will help to decrease the time and complexity of the analysis.

3 ADDRESSING THE CHALLENGES

In this section, we describe the potential analytics methods that aim to provide useful tools to solve the four major problems (summarised in Table 1) mentioned in the introduction: resource provisioning, problem diagnosis, understanding user satisfaction,
and discovering user behaviour patterns. Some evaluation methods will also be introduced.

3.1 Resource Provisioning

Let us start with the usage data from the left-most branch of the diagram in Figure 1. A typical problem in cloud-based environments is network resource management, for example finding an acceptable Virtual Machine (VM) configuration that minimizes the resources consumed by the services deployed on these VMs. A common problem experienced in data centers and utility clouds is the lack of knowledge about the mappings of the services being run by or offered to external users to the sets of virtual machines (VMs) that implement them (Wang et al., 2011). This can be addressed by exploiting analytics methods, for example predictive analysis on the usage data from the system logs of each VM, to predict a suitable configuration for future VM deployments.

For timely decision making, Wang et al. (2011) proposed a system integrating monitoring with analytics, termed Monalytics, which can capture, aggregate, and incrementally analyze data on demand and in real time, thus increasing accuracy and reducing human intervention in the analysis process. This was done by applying a clustering algorithm and a top-k flow analysis (Kumar et al., 2004) to the CPU usage data gathered on each VM, identifying the VMs that are responsible for the majority of the traffic flow in the group. This provides information on critical VM combinations to include in the same group to achieve maximum cost benefit during VM migration.

Usage data on VMs can also be exploited to predict VM states, which can be used as inputs to existing networking capacity management techniques. For example, in (Sun, 2016), the author proposed a method named Smart Predictive Capacity Management (SPCM), designed to assist cloud networking deployment in estimating the acceptable network capacity for a specific configuration of interdependent VMs by predicting individual VM states. This is done by applying Markov chain techniques to address the data analytics for potential states in a heterogeneous cloud computing environment. This work could help enterprises optimise VM configurations to attain significant performance improvements.

3.2 Problem Diagnosis

With the increasing scale and complexity of cloud-based applications, it has become more and more difficult for system operators to understand the behaviors of the system for tasks such as problem diagnosis. For example, system operators need to understand system execution behaviors to identify the symptoms and root causes of anomalies. System behaviors include a series of actions executed by the system and the corresponding changes in the system states. Although operators usually investigate a system starting from a specific state of interest, e.g., a hang state or failure state, it is critical to identify the series of states the system traversed to reach the current unstable state.

The study in (Fu et al., 2013) proposes a new approach for the contextual analysis of system logs to better understand a system’s behaviors. In particular, the authors use execution patterns extracted from the system logs, which ultimately reflect the runtime behavior of the application, and propose an algorithm to mine these patterns. Based on the execution patterns, their approach further learns the essential contextual factors by modelling the relationships among execution patterns that are responsible for the execution of a specific branch of the system.
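As a rough illustration of what mining execution patterns from logs can look like, the sketch below groups log events by a request identifier and counts how often each ordered event sequence occurs. It is a deliberately simplified stand-in and not the algorithm of Fu et al. (2013); the record layout, the `request_id` field and the sample records are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed log records: (timestamp, request_id, event).
# Real system logs would first have to be parsed into this shape.
LOG_RECORDS = [
    (1, "r1", "open_connection"),
    (2, "r2", "open_connection"),
    (3, "r1", "read_block"),
    (4, "r2", "read_block"),
    (5, "r1", "close_connection"),
    (6, "r2", "timeout"),
    (7, "r3", "open_connection"),
    (8, "r3", "read_block"),
    (9, "r3", "close_connection"),
]


def mine_execution_patterns(records):
    """Count the ordered event sequence observed for each request."""
    sequences = defaultdict(list)
    # Sorting the tuples orders them by timestamp first, so each
    # request's events are appended in execution order.
    for _, request_id, event in sorted(records):
        sequences[request_id].append(event)
    return Counter(tuple(events) for events in sequences.values())


patterns = mine_execution_patterns(LOG_RECORDS)
for pattern, count in patterns.most_common():
    print(count, " -> ".join(pattern))
```

In such a count, rare sequences (here, the one ending in `timeout`) are natural candidates for closer inspection during diagnosis.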
In this study, we also propose a contextual analysis method, inspired by the study in (Fu et al., 2013), that analyses the application logs to better understand the correlation between users’ behaviors and the corresponding system. In particular, we propose to use Formal Concept Analysis (Ganter and Wille, 1997) to mine execution patterns from the usage data from all sources: application, hosting VM(s), and underlying cloud logs. The execution patterns in this context can be considered as reflections of the user’s interactions with the application. The mining and learning results can help system operators understand both the behaviors of the customers and the execution logic of their services and applications.

3.3 Understanding User Satisfaction

Visualisation is a typical way to help system analysts understand how satisfied users are with their services. This information can be grasped more easily and quickly when presented through comprehensible information visualizations, for example by means of charts and graphs built from the basic interaction usage data (as shown in the bottom branch of the schema in Figure 1). This problem has been studied for decades in the context of learning, and multiple ways of visualizing data increase the perceived value of different feedback types (Dyckhoff et al., 2012).

A typical way for end-users to access cloud-based applications is through a web browser. Data analytics techniques, e.g., web mining, can therefore be employed to obtain interaction insights. Bucklin and Sismeiro (2009) provide an overview of clickstream data, defined as the electronic record of a user’s activity, which represents the traces an end-user leaves while accessing the cloud application. Analysing this kind of information can reveal how satisfied a user is with the provided services, based on the user’s interactions (obtained via the number of clicks, time spent, and other usage data).

Usage analytics can also reveal the engagement level of customers during the development and evaluation process of a software analytics project. It is well recognized that engaging customers is a challenging task, especially in the context of software engineering tools. Customers tend to keep their existing way of carrying out a task or of using a service. Furthermore, they usually lack the time to understand the pros and cons of the proposed tools because of tight development schedules. Thus, understanding customer engagement has a significant impact on the development of applications and services. Through visualisation and predictive analysis, analytics tools can provide effective visualization and manipulation of analysis results, as well as predict the engagement level, which helps the evaluation of the applications and services.

3.4 Discovering User Behavior Patterns

In order to understand user behavior, descriptive statistics, e.g., mean, total, standard deviation, most frequent value, etc., are typically used to obtain meaningful insights such as the basic behaviors of the users. This information can also be used to classify users based on the correlation and demographic similarities among them. To understand the patterns in user behavior, we propose to exploit the usage data from all layers of the cloud environment, since the usage data of a cloud-based application is spread across front-end interfaces (web browser, smartphone app/client and command-line interface) and the back-end (server instance and database instance) (Kesavulu et al., 2017), and to formulate them as the transition states of a graph. This type of graph can be used to mine execution patterns and to model relationships among different user behavior patterns. This kind of approach can also be used to discover problems that occur under specific contexts. To discover these contextual factors, we propose to use decision trees to learn the conditions, which allows us to determine any possible connections between the contexts and changes in the behavior of the user.

It is worth noting that these usage data can potentially be exploited in situation-emotional analytics (Märtin et al., 2017), which aims at recognizing emotions and changes of software situations in order to improve quality and user experience levels. Such emotional information is currently extracted via external biometric recording devices, e.g., devices that record eye- and gaze-tracking signals. We firmly believe that usage information at the application level will be very useful for this type of learning and can potentially replace eye- and gaze-tracking information.

3.5 Evaluations and Validations

Evaluation itself is a challenge in usage analytics. For example, if an analytical solution provides some information about user satisfaction, it is non-trivial to evaluate whether that information shows the real “picture” of how user satisfaction can be achieved. Traditionally, we need to run surveys and/or interviews with the actual users to evaluate the results of the analytical solutions. Another way is to ask domain experts to evaluate the proposed solutions.
This method, however, is also costly and very subjective. In this study, we propose a novel way of evaluation that uses another type of usage information, gathered from snapshots of the users’ devices: the screenshots of the interfaces used to access the application. By collecting this information (in the testing phase), we can provide users with the ability to recall and re-access their previous computer usage and the content they engaged with.

4 REALISING THE POTENTIALS

We firmly believe that usage analytics will very soon become a phenomenon for anyone working in cloud-based environments, and will have a positive impact on everyone who uses the technology. In this section, we point out the potential applications as well as possible research challenges and projects on usage analytics.

4.1 Potential Applications

In our vision, usage analytics opens up a new paradigm of opportunities, namely:

• Enhancing Productivity without Running Never-ending Surveys. Most companies run surveys and collect feedback from customers to learn how their services are being used. This is costly and time-consuming. With usage analytics, this information can be obtained almost in real time and, even better, more reliably. Given a set of usage data from different users over time (historical usage data), information can be provided on how they use the applications, what could cause errors, how many resources should be allocated, and so on.
• A Greater Knowledge of the System. Usage analytics can discover hidden user behavior patterns, providing information which would otherwise go unnoticed. Companies can identify trends and patterns in their customers as well as in their own systems, allowing better services with less expensive resources.
• Improving the Services and the Architecture Behind Them. Traditionally, software and cloud-based applications are upgraded version by version, which requires a lot of time for collecting customers’ feedback and system diagnosis results. By using usage analytics, this process can be simplified and the services can potentially be updated constantly, seamlessly and reliably.

4.2 Possible Research Challenges and Projects

Usage analytics also comes with challenges and opportunities for researchers. It is important that the research community helps to address the challenges in this emerging and important field. We cannot simply apply our existing analytics methods to this type of data and hope for success. Therefore we need specific approaches addressing the specific challenges. As a first way-point for researchers, we propose different research topics and research questions.

• How Can We Identify and Extract Important Information from the Usage Data? Deciding what is important and should be extracted from the data is a nontrivial task. Going beyond standard analysis, such as predicting some satisfaction level, will be important, forcing researchers to think creatively. Many research questions arise, such as how to combine information from different raw usage data, or how to process the data efficiently. Another important part here is to explore how context and situations can be taken into account to improve the quality of the analysis.
• How Can We Present the Results to the User, i.e., the Company? Reporting the results to the users is one of the most important parts of the analysis of this data. Nevertheless, this is not trivial, since the amount of data and information that can be extracted is huge. It will be important to research novel interfaces that allow users to easily find the root causes of errors or understand the engagement of their customers. Generating summaries and automatic reports will be another important topic for this data, since users will need such summaries, for example a weekly report on the whole system.
• How Can Information and Usage Data Be Processed Efficiently? Systems that have to process a huge amount of data in a complex way have to be efficient to be useful to their users. This comes with challenges for researchers in terms of how to parallelize and process data efficiently in a reasonable amount of time, and how to combine different research fields, from software analytics to machine learning.

The potential for usage analytics is enormous. We do acknowledge that there are challenges to be overcome, such as finding the right analytics techniques, synchronization, data extraction, and the development of a new generation of analytics tools on usage data,
but we believe that these will be overcome and that we are on the cusp of a positive turning point for the cloud-based applications community.

5 CONCLUSIONS

We presented the challenges in getting insights from cloud-based applications. We pointed out that analytical solutions on usage data, namely usage analytics, can help to overcome these challenges. A complete picture of how to apply usage analytics to get insights from cloud-based applications and services is shown and discussed. Some potential applications were also addressed. Our future work aims (i) to design and develop methods/techniques to collect, extract and/or aggregate the usage data from applications, the VMs hosting the application and the cloud system hosting the VMs; and (ii) to develop an experiment to evaluate the usage data extraction and analysis methods.

ACKNOWLEDGEMENTS

This work was supported with the financial support of the Science Foundation Ireland grant 13/RC/2094 and co-funded under the European Regional Development Fund through the Southern & Eastern Regional Operational Programme to Lero - the Irish Software Research Centre (www.lero.ie).

REFERENCES

Aceto, G., Botta, A., de Donato, W., and Pescapè, A. (2012). Cloud monitoring: Definitions, issues and future directions. In IEEE International Conference on Cloud Networking (CLOUDNET), pages 63–67.
Bezemer, C.-P., Zaidman, A., Platzbeecker, B., Hurkmans, T., and Hart, A. (2010). Enabling multi-tenancy: An industrial experience report. In IEEE International Conference on Software Maintenance, pages 1–8.
Bucklin, R. E. and Sismeiro, C. (2009). Click here for internet insight: Advances in clickstream data analysis in marketing. Journal of Interactive Marketing, 23(1):35–48.
Dyckhoff, A. L., Zielke, D., Bültmann, M., Chatti, M. A., and Schroeder, U. (2012). Design and implementation of a learning analytics toolkit for teachers. Educational Technology and Society, 15(3):58–76.
Fu, Q., Lou, J.-G., Lin, Q., Ding, R., Zhang, D., and Xie, T. (2013). Contextual analysis of program logs for understanding system behaviors. Proceedings of the 10th Working Conference on Mining Software Repositories, pages 397–400.
Ganter, B. and Wille, R. (1997). Formal Concept Analysis: Mathematical Foundations. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1st edition.
Gasparetti, F. (2016). Modeling user interests from web browsing activities. Data Mining and Knowledge Discovery, 31(2):1–46.
Jehangiri, A. I., Yaqub, E., and Yahyapour, R. (2013). Practical aspects for effective monitoring of SLAs in cloud computing and virtual platforms. In International Conference on Cloud Computing and Services Science, pages 447–454.
Kabbedijk, J., Bezemer, C.-P., Jansen, S., and Zaidman, A. (2015). Defining multi-tenancy: A systematic mapping study on the academic and the industrial perspective. Journal of Systems and Software, 100:139–148.
Kesavulu, M., Dang-Nguyen, D.-T., Helfert, M., and Bezbradica, M. (2018). An Overview of User-level Usage Monitoring in Cloud Environment. In The UK Academy for Information Systems (UKAIS).
Kesavulu, M., Helfert, M., and Bezbradica, M. (2017). A Usage-based Data Extraction Framework for Cloud-Based Application - An Human-Computer Interaction approach. In International Conference on Computer-Human Interaction Research and Applications (CHIRA), Madeira, Portugal.
Kumar, A., Sung, M., Xu, J. J., and Wang, J. (2004). Data streaming algorithms for efficient and accurate estimation of flow size distribution. SIGMETRICS Perform. Eval. Rev., 32(1):177–188.
Märtin, C., Herdin, C., and Engel, J. (2017). Model-based User-Interface Adaptation by Exploiting Situations, Emotions and Software Patterns. International Conference on Computer-Human Interaction Research and Applications.
Mell, P. and Grance, T. (2011). The NIST definition of cloud computing. NIST Special Publication, 145:7.
Pachidi, S., Spruit, M., and Van De Weerd, I. (2014). Understanding users’ behavior with software operation data mining. Computers in Human Behavior, 30(January):583–594.
Sun, X. (2016). Virtual Machine Optimizations Using Markov Chain Data Analytics in Heterogeneous Cloud Computing. In International Conference on Smart Cloud, pages 248–253.
Sun, X., Gao, B., Fan, L., and An, W. (2012). A cost-effective approach to delivering analytics as a service. In International Conference on Web Services, pages 512–519.
Wang, C., Schwan, K., Talwar, V., Eisenhauer, G., Hu, L., and Wolf, M. (2011). A flexible architecture integrating monitoring and analytics for managing large-scale data centers. ACM International Conference on Autonomic Computing, page 141.
Wang, G., Zhang, X., Tang, S., Zheng, H., and Zhao, B. Y. (2016). Unsupervised Clickstream Clustering for User Behavior Analysis. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems - CHI ’16, pages 225–236.
Zhang, D., Dang, Y., Lou, J.-G., Han, S., Zhang, H., and Xie, T. (2011). Software analytics as a learning case in practice. Proceedings of the International Workshop on Machine Learning Technologies in Software Engineering - MALETS ’11, pages 55–58.