Note: this content has been automatically generated.
Supporting and advocating
internationally for the full engagement
of women in all aspects of the
computing field, ACM-W's most
prestigious award is the Athena
Lecturer award. This celebrates women
researchers who make fundamental
contributions to computer science. Each
year ACM-W honours one woman as the
Athena Lecturer. This year's
recipient of the Athena Lecturer award
is Susan Dumais.

As many of you already know,
in Greek mythology Athena is the
goddess of wisdom, courage, inspiration,
civilisation, law and justice,
mathematics, strength, the arts, crafts,
skill and war strategy. I think that's
her favourite. I've known Sue for
a very long time and have been lucky
enough to have had support and advice
from her in my own career. I can
honestly say she has many Athena-like
qualities and traits. She has the
strength and craft it takes to pursue
difficult problems in HCI and computer
science and to communicate those ideas
with great eloquence. She was born and
raised in Maine and got a BA
in mathematics and psychology from a
liberal arts college. She went on to
graduate with a PhD in cognitive
psychology from Indiana University. Sue
has made really strong contributions to
our CHI community for very many years.
She started out her career at Bell Labs
and Bellcore, and in nineteen ninety
seven went to Microsoft Research, where
she has been doing stellar,
groundbreaking research and also acting
as an adjunct professor at the
University of Washington. She presented
a paper at the very first ACM CHI
conference in nineteen eighty three;
this paper was called "Using examples to
describe categories". She also has the
distinction of presenting a very
memorable paper, with a very
hard-to-say title, at the CHI nineteen
eighty two precursor meeting. I'm
gonna try and say it: "Statistical
semantics: how can a
computer use what people name things
to guess what people mean when they
name things?" She was clearly ahead of
the times, in that she was
already struggling with the hard
problems that she is now really
bringing big solutions to: what we call
personalisation in search and
information retrieval. This
theme runs throughout Sue's research.
And she's always been very
interdisciplinary and had a
user-centred perspective. She takes a
very broad approach and really tries to
bring users into everything she does.
She's been recognised in the HCI
community, the IR community and the web
sciences community for her research,
and her awards are equally broad and
well deserved. She was inducted into the
CHI Academy in two thousand and five,
recognised as an ACM Fellow in two
thousand six, received the SIGIR
Gerard Salton Award for lifetime
achievement in two thousand and nine,
was inducted into the National
Academy of Engineering in two thousand
eleven, and received the Tony Kent Strix
Award in two thousand and fourteen. So
I'm very happy that she's also now
the Athena Lecturer. She's been a
tireless contributor in terms of
service to CHI as well. She's been
involved in a number of program
committees. She's run a number of
doctoral consortia, and she has mentored
many people informally. She and Judy
Olson were CHI PC co-chairs in nineteen
ninety four, and at that time, some of
you will remember, submissions were on
paper. Paper, remember that? I would
also note that Judy Olson
was recognised as an Athena Lecturer in
two thousand and eleven. So this
incredibly powerful team have continued
to be influential and are happily being
recognised. Sue is the first person from
industry to be recognised as the Athena
Lecturer. And she illustrates that
it's possible to do amazing research,
inspire generations of researchers,
peers as well as younger researchers,
and influence products that have an
effect on the lives of millions. This is
a really well deserved award, Sue, and
it's a great achievement for the HCI
community. I would really like you all
to congratulate and welcome Sue Dumais
to the stage. Thank you.

Yeah, excellent.
Okay, thank you, Elizabeth, for the
wonderful introduction. I'm deeply
honoured by the support, especially
since the nomination was by my peers
both in HCI and IR. As Elizabeth noted,
I really value the importance of a broad
interdisciplinary perspective in
attacking problems, and it's
wonderful to see this recognised. But
it's not just an award for me; it's
really an award for everybody in this
room and the broader CHI community. As
Elizabeth said, ACM awards one of these
a year, and so I take it as a
recognition by ACM of the importance
of the kinds of core pursuits that
we have in the human-computer
interaction community, both to computing
and more broadly in
people's lives. And I'd like to shout
out especially to Judy Olson, who, as
Elizabeth mentioned, received the award
about four years ago and spoke at CSCW,
and was co-chair with me at CHI for
papers almost twenty years ago. So what
I'd like to do in the talk today is
talk about large-scale behavioural log
data, and both some of the amazing
opportunities as well as some of the
challenges and limitations of these.

The rise of web services over the last
decade has made it possible to gather
traces of human behaviour in situ, as
people are working in their natural
environments, at a scale and fidelity
that was previously unimaginable. This
has really transformed how web-based
systems are designed, evaluated and
improved. So there are amazing
opportunities as well as challenges
here. Using examples from web search,
I'll talk about two kinds of logs. I'll
highlight how using observational logs
can provide a rich new lens onto the
diversity of the people, tasks and
interaction strategies that we see on
the web, and also talk about how
experimental logs can transform how we
design and evaluate web systems. I'll
also talk at the end about some of the
challenges and limitations. So to
highlight the importance of this
emerging new way of knowing, if you will,
I'd like to step back in time twenty
years. So twenty years ago web
search, and in fact the web
itself, was really nascent. The NCSA
Mosaic graphical browser was less than
two years old, and modern web search
engines were less than a year old. And
if you remember Mosaic (yeah, this is
an older crowd), CHI nineteen ninety
five had an online presence, sort of
minimalist Times New Roman, and there's
a really interesting highlight here
that says: to view the conference at a
glance you need a graphical browser.
Okay, the web was really nascent twenty
years ago. I usually do this
interactively, but that's a little hard
with this audience, where I can't see
anything. The size of the web, looking
at the number of top-level domains, was
about two point seven thousand,
okay, twenty seven hundred websites.
Lycos and WebCrawler were two of the
early search engines that actually
indexed the full content of pages. When
Lycos was released in late ninety four
it indexed fifty four thousand pages,
okay, and in fact it didn't index the
full text of them, because Fuzzy Mauldin
was unsure about what the copyright
issues were in building a
full-content index with
all the positional information, from
which you could reconstruct the full
page. The times have really changed.
There's another interesting thing about
this site. It's really wonderful
to go to the Internet Archive and look
at these sites as they existed decades
ago. There's a link to look at the top
five percent of the sites, so you
could link and browse through the
contents of twenty five hundred of your
favourite web pages every day. What
I think is most relevant to the
topic today is that behavioural logs
were also next to nonexistent. So
Lycos received about a
thousand queries a day. You now go
through an order of magnitude more than
that in a modern web search engine
every second. And the reason for this
is that most search and most logging
was done on the client. I started work at
Microsoft in nineteen ninety seven, and
one of the first things that
happened was that the Office help team
stopped by my office and said: fabulous,
you're here, we have trouble with Office
help, we hear that it's a little less
than ideal. And so my first
question was, well, what's wrong?
What are people searching for? What are
they finding? And they sort of shrugged
and said, we don't really know. And it's
not because they were bad
engineers or bad human-computer
interaction folks; they literally didn't
know. All the documentation was
on the client, all searches were
done on the client and never sent
anywhere, right? So when the move from
Office ninety seven to Office two
thousand was made, search went online.
And that really transformed all sorts
of things. First of all,
they saw for the first time all sorts
of things that people were asking about
that they had no documentation for,
things like: I used to do this in
Windows ninety seven or Office ninety
seven, how do I do it in Office two
thousand? It also really helped mitigate
some of the vocabulary mismatch
problems that Elizabeth alluded to in
that tongue-twisting title.
There's no place with a worse vocabulary
mismatch between what the searcher is
looking for and what the authors have
written about than online documentation.
So they immediately saw people using
different words than the authors of
the documents, and, dramatically, with
no change in algorithms, by
incorporating an understanding of
user behaviour, search
became better. Today we're in a
different world. There are billions of
web sites, trillions of pages indexed by
search engines, billions of searches and
clicks every day. It's nothing short of
startling, the magnitude is just
startling, and the fact that it works
most of the time is nothing short of
amazing. Search has really, I think,
over the last decade been transformed
from an arcane skill that library
scientists or computer geeks had to
something that absolutely everybody in
the world does every day. There's a
tremendous diversity of people using
search and of tasks they use it for.
Increasingly we use it not just
to find information but to buy things,
to look for medical information, to
plan travel, to monitor current events.
It really is a core fabric of our
everyday lives. And it's also very
pervasive. I'll talk today about web
search, but search is much broader: it
happens on the desktop, in the
enterprise, in apps. The one place where
I think it's lacking (I was complaining
to Dan Russell about this the
other day) is in the real world. I was
walking through a store and I wanted
to find granola bars, and my left
fingers were twitching. I kept trying to
type Control-F, like, get me to
the granola bars. I think there's still
a ways to go here. But these
times, where search has been transformed
from something that very, very few
people did, a thousand people a day
using the most popular search engine on
the web, to something that billions of
people do every day, make it more and
more important to understand and
support searchers than ever
before.

So what are behavioural logs?
They're traces of human behaviour
through whatever lens or whatever
sensors we have. In the physical world,
we've all seen books that fall open to
the page we've been to before. My
statistics books, if I open them up,
fall right open to some complicated test
I can never get right. Sometimes they're
more intentional: you might dog-ear a
page, you might write annotations and
highlight, put in some marginalia, and
you might do much richer annotation.
This has all sorts of equations written
out as well as highlighting. It happens
not just in documents but in the
physical world. This is a path: you can
see the path that the engineers designed
and the path that people take. And
sometimes these trails can be much
more ephemeral, such as footprints in
the sand. In the case of web search, the
trails that we see are search queries,
the results, what people click on, how
they reformulate queries, how long they
spend on pages. So a lot of what I'll be
talking about today is driven by these
behavioural logs. What's important about
behavioural logs, I think, is that
they're actual behaviour: people trying
to accomplish tasks in their natural
environment. They're not recalled
behaviours, they're not subjective
impressions of what went on yesterday,
they're not controlled laboratory tasks.
And behavioural logs can be used in
lots of ways; I'll detail these shortly.
They can be used a lot behind the
scenes to make it easier to find the
relevant information, but they can also
be reflected in interfaces of
various kinds. This is a screenshot
of a system called Edit Wear and Read
Wear that Jim Hollan and colleagues
developed in, I guess it was, CHI
nineteen ninety two. On the left you see
a normal scroll bar with the region of
focus highlighted, and to the right of
that you see scroll bars that
are annotated by where people have
either read or edited the document,
so that the interaction with the
document that happens anyway is
reflected in an artifact that people
can use to better navigate. A more
recent example is some lovely work by
Adam Fourney and colleagues on something
called InterTwine. Here the idea
was to take browsers and feature-rich
applications and try to provide a kind
of inter-application scent. In this
particular screenshot, what he's done
is, for somebody who searched a lot on
technical documentation, in
this case GIMP, he's highlighted
what people did in the application
with the search query that returned
it. So here it's about transforming
some picture into greyscale, and
you can see that in the actual search
results. So, a really interesting way of
trying to bridge across multiple
applications with behavioural trails.
Maybe the most common example of
these behavioural logs is
autocompletion in search. Here I typed
in CHI two thousand fifteen, and you can
see all sorts of other things that
people typed after that. One of them
that I was a little surprised to see
was rebuttals; that turns out to be
something that people really care about
at CHI. And here's an example of
my personal search logs in
Bing that I can use for personal
reflection or refinding. I guess it's
not so personal now that I've shared it
with some of my closest friends and the
audience, but I did look for comedy as
well as meeting planning and news
yesterday.

I want to put large-scale
behavioural logs in the context of a
much broader set of research methods
that we use. This is a gross
oversimplification; if I don't mention
your favourite method, I realise that
there are a multitude of them
and that they're all interesting and
important. What I wanna do is focus
on three kinds in this talk. The
first is lab studies. These typically
involve tens or hundreds
of people. Often the tasks are known;
sometimes participants bring their own
tasks. Some of the beauty of
lab studies is that they can involve
very detailed instrumentation: video,
eye gaze, screen capture, people can
speak aloud. You get, in some
sense, a very thick trail of
people's behaviour, although for a
limited set of tasks. You can evaluate
experimental systems or even
systems that don't exist if you use
Wizard of Oz techniques. So they're
wonderful in their richness and depth.
They don't cover the breadth of
experiences that I think modern-day
web search services need to cover.

There are also panel studies, which sort of
are in between large-scale logs and
lab studies. These often involve
hundreds or thousands of people who
download specific client-side software
that allows behaviour to be monitored
or recorded. You can
also probe about specific tasks or
specific conditions that are of
interest. What's important about these,
in contrast to lab studies, is that
people are using their own search
systems on their own computers in ways
that they normally do, so you get a much
richer sense of tasks and activities
that people are performing in the wild.
And the last level of analysis I wanna
talk about is log studies. These, as I
mentioned before, in the case of web
search services involve millions of
people and tasks. There's a tremendous
diversity, a really eye-opening
diversity, of what people are doing with
search engines and what kinds of things
we need to support. Here there's an
abundance of data, but it's very thin,
it's noisy and it's not
labelled. I'll talk about some of those
challenges later. So in this range of
experimental methods, from lab studies
to panels to large-scale logs, I want to
highlight that one can do observational
studies as well as experimental
studies. So in the lab you can just
look at what people are doing, not
compare two systems. In panels we do it
through ethnography or case studies or
things like Nielsen studies. And in
the logs we often look at traces of
behaviour through Twitter and Wikipedia,
for example, where there's
no experiment, at least on the
observer's part, involved. But we can
also do controlled experiments at
all of these scales. In the laboratory
it's obvious. In panel studies there are
clinical trials and field trials. And
the lifeblood of many web search
systems is the ability to do carefully
controlled experiments and understand
the effect that those have in realistic
settings. So the observational data
allows us to build sort of an abstract
picture of behaviour, the sort of things
that we need to focus on and design.
And then experiments are really to help
us decide whether one approach is
better than another. I think we
tend to think about
carefully controlled
experiments as happening in the lab
and observations as happening at
web scale. But I think this slide
illustrates that, regardless of the
scale of the experiment, you can observe
in rich detail and you can do careful
experimentation. What I'm gonna focus on
today is the lower row here,
looking both at observational and
experimental log studies.

The benefits of large-scale behavioural
logs I'll highlight; there are many
of them, I wanna highlight three of them.
One is that they're real-world: they're
a portrait of behaviour in the wild.
They're not imagined tasks.
In a lab, there
are lots of things that people will not
do. You don't see a lot of porn, and you
don't see a lot of very personal
queries. Even if you let people generate
their own tasks, you don't see repeat
behaviour, you don't see people issuing
the same query over and over again;
they'll switch to a new task. These are
all things (porn, personal
searches, repeat searches) that
are really prevalent in the world, and
so we don't see those in laboratory
studies. They're also large-scale: we
can see millions of people and tasks.
Even rare behaviours can be examined,
so if you get a billion searches a
day, you can observe a one-in-a-million
occurrence a thousand times during the
course of the day, okay? And as I've
mentioned before, one of the things that
strikes you over the head when you
look at how systems are being used
in the wild is that there's just
tremendous diversity in what people are
seeking and how they're trying to
satisfy those information needs. It's
what Chris Anderson has called
the long tail of information needs.
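
The point about rare behaviours is simple expected-value arithmetic, sketched below in Python. This is an illustration rather than anything shown in the talk: the function names are made up here, and the lab-study size (100 participants doing 20 tasks each) is an assumed figure chosen only for contrast.

```python
def expected_occurrences(events: int, one_in: int) -> float:
    """Expected count of a behaviour that happens once per `one_in` events."""
    return events / one_in


def chance_of_seeing_once(events: int, one_in: int) -> float:
    """Probability of observing the behaviour at least once in `events` trials."""
    return 1.0 - (1.0 - 1.0 / one_in) ** events


if __name__ == "__main__":
    # Figures from the talk: a billion searches a day, a one-in-a-million behaviour.
    print(expected_occurrences(1_000_000_000, 1_000_000))  # 1000.0

    # Assumed lab study: 100 participants x 20 tasks = 2,000 observed searches.
    print(chance_of_seeing_once(100 * 20, 1_000_000))
```

Under these assumptions the lab study has well under a one percent chance of ever seeing the behaviour, while the log sees it about a thousand times a day.
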
Search logs or behavioural logs are
also real-time, maybe not as real-time
as Twitter, but the feedback about what
people are doing, often reflecting
events happening in the real world, is
pretty immediate. This is the query
distribution, the frequency of queries
for the query "flu" over time. Two or
three years ago there was a huge spike
during the outbreak of the swine flu,
but you can see yearly peaks
around wintertime here. You can
also see events. This is
the query "gyrocopter", which didn't
happen very much at all over the last
five or six years. But a few days ago
(or last week, I guess), when
somebody flew a gyrocopter onto the
lawn of the US Capitol, people all of a
sudden started asking about it. Okay, so
these are some of
the benefits of behavioural logs that
we can use to understand people and
to improve systems.

So I'll give you a
one-slide tutorial on behavioural logs
and web search. The question in web
search is how you go from, on average,
two or three query words to anything at
all sensible; it is miraculous that it
works, right? The first stage of web
search involved
matching queries that people issued to
the content of web pages. The next
generation involved also exploiting link
structure: understanding who links to
your page, which pages you link to. That
allows a retrieval engine to set
non-uniform priors on web pages in
retrieval. In the last decade or so,
user behaviour has become very
important: the anchor text we use to
point to web pages, the query-click
trails that we leave, query
reformulation (you try something, it
doesn't work, you try again). All of
those are important indicators of what
people are doing and what's succeeding
or not, as the case may be. Contextual
metadata is increasingly important. We
know where queries are asked, what time
it is, and those are really important in
determining what's relevant. If I ask
for, I don't know, "US Open two
thousand fifteen" at this time of year,
I probably wanna know about the golf
tournament, which is just coming up; in
a few months I'll want to know about the
tennis tournament. So all of this
contextual metadata is really important
in providing what's relevant to people
at the right time. So all of these
behavioural data are used to improve
web search algorithms in
many ways. Let me just give you a couple
of examples. So how many of you have
ever benefited from spelling correction
in web search? Right, like, we can't
type anymore. It's not that algorithms
for spelling correction have gotten
markedly different or better over the
last decade. What has changed, what
dominates that, is the amount of
information that we have. If people type
a query and retype a variant of it soon
thereafter, it's a pretty good signal
that the first one failed, either
because the search engine didn't return
the right thing or because it was a typo.
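
A minimal sketch of that reformulation signal, in Python. Everything here is an assumption made for illustration: the log format (user, timestamp in seconds, query), the 30-second window, the edit-distance threshold of 2, and the function names. It is not a description of how any production search engine implements spelling correction.

```python
from collections import Counter, defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def mine_corrections(log, max_gap_s=30, max_edits=2):
    """Map each query to the variant users most often retyped right after it."""
    rewrites = defaultdict(Counter)
    last = {}  # user -> (timestamp, query) of that user's previous query
    for user, ts, query in sorted(log, key=lambda r: (r[0], r[1])):
        if user in last:
            prev_ts, prev_q = last[user]
            if (prev_q != query
                    and ts - prev_ts <= max_gap_s
                    and edit_distance(prev_q, query) <= max_edits):
                rewrites[prev_q][query] += 1
        last[user] = (ts, query)
    return {q: c.most_common(1)[0][0] for q, c in rewrites.items()}


# Hypothetical toy log: (user, seconds since start, query).
SAMPLE_LOG = [
    ("u1", 0, "granola brs"), ("u1", 8, "granola bars"),
    ("u2", 100, "granola brs"), ("u2", 105, "granola bars"),
    ("u3", 50, "granola brs"), ("u3", 400, "gimp greyscale"),
]

if __name__ == "__main__":
    print(mine_corrections(SAMPLE_LOG))
```

The sketch only counts quick, near-identical retypes; a real system would also need to handle the reverse direction (a correct query followed by a typo) and weigh the counts against overall query frequency.
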
We don't think twice about crazy
misspellings anymore in search. And
similarly, as I just mentioned, time and
location are really important in
helping interpret what people might
mean by a very short query. The same
query in different locations or at
different points in time should
probably be satisfied by different
results.

So going back in time twenty
years, there were lots of surprises in
looking at early web search logs. Web
search was nascent; the intuitions that
people had about what
web search was gonna
be used for, what algorithms would work,
were really intuitions derived from
library search. And so looking at the
early search logs provides really a
reality check on what people want to do
with this fabulous new tool, not what
designers anticipated them doing. So
there were several early web search
log analyses, some through publicly
available data that Doug Cutting
provided from Excite, and a lot of it
from work that Craig Silverstein and
Andrei Broder and others did at Alta
Vista. The thing that was
incredibly striking in the early web
search logs is that web search is not
library search. Queries are short. A lot
of people search for sex, more so in
ninety seven than now; I just looked
yesterday at the top two hundred
searches on Bing, and the
prevalence of sex search queries has
decreased markedly over the years, in
part because there's other information
on the web about all sorts of things.
Navigation is a common behaviour. So
people assumed that searchers would
look for information in much the same
way that you do in libraries, but a
lot of searches are really aimed at
getting you to a location. So the most
common query in early search logs, we'll
see in a minute, was something
like Yahoo or Hotmail. They were
really ways of providing easy
navigation to sites rather than seeking
information. Queries are not
independent: you find lots of queries
about a particular task clustered at the
same time. As I mentioned before,
there's a huge tail of information needs
and strategies for solving them. This
isn't just true of web search versus
library search; it's true for web search
versus desktop search. Things that you
think you know from the web don't
necessarily apply to the desktop. The
same is true looking at web search
versus mobile search. A lot of the
assumptions that we go into systems
with turn out, when you look at what
behaviours are going on, not to hold.
So I want to talk about the diversity,
really to give you a sense of how
important it is. The Excite logs
provided two point five million
queries. (Is it really nine o'clock?
Yes? Okay, I'll speed up.) Two point
five million queries, but two
hundred and fifty of those queries
accounted for ten percent of the
traffic, so as a search provider you
absolutely have to nail those. In the
tail, on the other hand, there were
almost a million of those that occurred
exactly once. You also need to
accommodate that.
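
The head-and-tail shape described here can be measured directly from a list of logged queries. The Python sketch below is illustrative only: the function name and the twelve-query toy log are invented, and the Excite figures quoted in the talk appear only in a comment.

```python
from collections import Counter


def head_and_tail(queries, top_k):
    """Return (share of traffic from the top_k queries, count of one-off queries)."""
    counts = Counter(queries)
    total = sum(counts.values())
    head = sum(n for _, n in counts.most_common(top_k))
    singletons = sum(1 for n in counts.values() if n == 1)
    return head / total, singletons


# Hypothetical toy log. In the Excite data cited in the talk, the top 250
# queries covered about ten percent of 2.5 million queries, and almost a
# million queries occurred exactly once.
TOY_LOG = (["hotmail"] * 4 + ["yahoo"] * 3 + ["mp3"] * 2
           + ["y2k bug fix", "email of paul allen", "pokemon cards"])

if __name__ == "__main__":
    share, singletons = head_and_tail(TOY_LOG, top_k=2)
    print(f"top-2 share of traffic: {share:.2f}, one-off queries: {singletons}")
```
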
So it's a Zipf distribution. Here are
the top ten queries in nineteen ninety
nine. You can see some of them are
navigational, some are things that
were available on the web: Pokemon,
MP3 and so on. Here are queries
that occurred only once. Some of them
are Y2K issues; some of them
were these completely crazy things
like "provide me the email of Paul
Allen, the Seattle Seahawks owner". You
need to solve those needs. And there's a
whole bunch that are more intermediate
frequency. Queries vary over time and
task: there are daily periodicities,
weekly periodicities,
things that are trending, and events
that happen; they're reflected in
logs. There are also important
individual and task differences that we
see in logs. So for a query like "ACM
awards",
what we've been talking about
earlier today, if I issue the query
"CHI two thousand fifteen" right before
it, you can be pretty sure what I mean.
If you issue the query "country
music", you may have meant the Academy
of Country Music Awards, which happened
last week and probably received much
more press coverage. If Garth Brooks
asks the query, he probably means
something different than I do. Here's an
example, a really clean example, of a
query log. I wanna go through some of
the kinds of things you can get by
observing. If you look at the query
typology, you can distinguish
informational from navigational queries.
You can look at queries whose frequency
varies: here CHI two thousand fifteen
is the most common query; that's not
true in the web as a whole. You can
look at long-term trends: here's a
person who's queried "computational
social science"
several times. You can look at
short-term tasks: these are actually
several queries I performed just the
other day, looking for the program and
for registration times. So using these
kinds of insights about types of
queries, frequency of queries, repeat
behaviour of queries, we can design
and improve our ranking algorithms and
interfaces; I'll talk a little bit
more about that. Most importantly, we
can develop test sets that reflect real
behaviour rather than imagined
behaviour, right? So this is actually a
very interesting example; I'm gonna rush
through it, but I wanna mention it.
So this is going to be
using search logs as a lens for going
beyond improving web
search per se. Search for health
information is incredibly common and
important: about eighty percent of US
adults have searched the web for
medical information. One in two
hundred fifty people query about the
top one hundred prescription medicines
in the US. So mining health search
data to identify things like adverse
drug effects or side effects is
something that's possible. Today
these side effects and interactions are
detected based on reports from
patients and clinicians. It's a
slow process and not a very
rich one. There was a study
reported in two thousand eleven looking
at the interactions of paroxetine,
which is an antidepressant drug, and
pravastatin, which is a
cholesterol-reducing drug. Researchers
discovered that these two drugs taken
in combination seem to lead to
hyperglycemia. So Eric Horvitz and
Ryan White decided to see whether they
could find early warning signs of these
effects in search logs. So they looked
at search logs from before two thousand
eleven and found quite robust signals
that people who query on both of these
drugs have a much higher likelihood
of querying on side effects of
hyperglycemia, things like thirst or
increased appetite or high blood sugar.
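
The comparison behind that finding can be sketched as a rate calculation: among users who searched for both drugs, what fraction also searched for a symptom term, compared with users who searched for only one of the drugs? The Python below is a toy illustration with invented users and invented term sets. It is not the method or the data from the study mentioned in the talk, which would also need proper statistical controls.

```python
def symptom_rate(users, required, excluded, symptoms):
    """Fraction of users who queried every `required` term, none of the
    `excluded` terms, and at least one `symptoms` term."""
    group = [terms for terms in users.values()
             if required <= terms and not (excluded & terms)]
    if not group:
        return 0.0
    return sum(1 for terms in group if terms & symptoms) / len(group)


# Hypothetical per-user sets of queried terms.
USERS = {
    "u1": {"paroxetine", "pravastatin", "thirst"},
    "u2": {"paroxetine", "pravastatin", "high blood sugar"},
    "u3": {"paroxetine", "pravastatin"},
    "u4": {"paroxetine"},
    "u5": {"paroxetine", "thirst"},
    "u6": {"pravastatin"},
    "u7": {"pravastatin"},
    "u8": {"paroxetine"},
}
SYMPTOMS = {"thirst", "increased appetite", "high blood sugar"}

if __name__ == "__main__":
    both = symptom_rate(USERS, {"paroxetine", "pravastatin"}, set(), SYMPTOMS)
    only_a = symptom_rate(USERS, {"paroxetine"}, {"pravastatin"}, SYMPTOMS)
    only_b = symptom_rate(USERS, {"pravastatin"}, {"paroxetine"}, SYMPTOMS)
    print(f"both: {both:.2f}  paroxetine only: {only_a:.2f}  "
          f"pravastatin only: {only_b:.2f}")
```
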
So search logs can increase, I think,
the speed and scale of detecting some
of these. This isn't the end-all of
all of this; they're validating it in
lots of other ways. But I think it
highlights the potential to identify
some of these things, which can be
further studied in other ways. I'll
skip this; I'll get back to it
shortly.

So we can go from observations
to experiments. Observations, as I said
before, generate insights about
behaviour and ideas for improving web
search, but experiments are really the
lifeblood of search engines: they are a
way to systematically
improve search. They're used to improve
all sorts of aspects of web search
systems, from system latency (people
notice differences of as little as
fifty milliseconds in page load
time, and it influences search
behaviour in a multitude of ways).
They're also used to influence ranking
algorithms, to compare snippets.
Snippets are a very important but, I
think, oftentimes overlooked aspect of
web search: they're the way that you
figure out whether to follow links more
fully. They're used to support and
evaluate different spelling and query
suggestion algorithms, and also richer
presentations: how do we know
whether putting these rich answer boxes
on the right is useful at all, and
for what cases is it useful? Experiments
allow folks to become much more data
driven rather than people driven.
Ronny Kohavi has this brilliant
acronym, HiPPO, standing for the
highest paid person's opinion. And so
without data that's how things often
get arbitrated. I'm
not gonna talk about how to conduct
experiments at web scale; in
some ways it's very similar to
conducting experiments at smaller scale.
I do want to point out that some things
are much easier to study
experimentally than others.
Algorithms, the ranking algorithm you
use behind the scenes, are pretty easy
to study: it doesn't influence user
behaviour. Other things, like new
interface techniques, are much harder,
because they require changes in how we
interact with systems. And social
systems are incredibly hard to study,
because there are huge spillover
effects from a treatment that I might
receive to the treatment that my friends
might see.

So the value of behavioural
logs, I think, comes from providing
often surprising insights about how
people interact with existing systems.
It allows us to focus energy on
supporting actual versus presumed
activities. It suggests experiments
about important and unexpected
behaviours, and it can
support a wide variety of search
experiences. In addition, we can improve
search in a whole host of ways using
controlled experiments. Really,
this has transformed how systems are
designed, evaluated and improved.
However, logs have some limits.
They're large, but unlabelled. Logs
can't tell us what people are really
trying to do, whether they were
successful, what their experience was,
or what they were
attending to. The same behaviour can
mean many different things. If I don't
click on a web page, is that good or
bad? Sounds bad, but if there's this
inline answer that shows me
exactly the result I want, like the
weather in Seoul today, it might be
good. The experiments are limited; logs
are limited to existing systems.
Actually, logs tell us a lot about
what and how people search, but not very
much about why. And it's important to
complement logs with a variety of other
techniques to provide a much richer and
more complete picture of what people
are doing and how we advance
that. I have several examples of this.
I was supposed to be done in ten
minutes? Okay, I will go through just a
couple of these, not all the rest
of them, and wrap up and leave time for
some questions. So an important thing
is to try to capture something about
the why rather than just the what. So
about a decade ago we built a system
called Curious Browser. It
was client code that you downloaded,
and it captured a lot of implicit
activities: what queries you're
issuing, what query reformulations
there were, what clicks, and how long
the dwell time was. And we also probed
for explicit judgements about the
relevance of the page and the
success... Okay, folks at the
control, if you could plug in or find
another power source that would be
great. Yeah, so we would probe for
relevance of a page or success of the
session. Yeah, oh, it's not
working. Can you run the slides
from the back? Next point. Okay, well,
I'll describe what we did. What we did
was, given a set of
search results, if you visited a web
page... And, oh, okay, let's see if it
works again. Great. If you visited a web
page and later came back to the search
engine, or under a number of other
conditions that we had in the state
machine, we would probe whether that
search result was good or not
so good. We also probed about the
session: whether during the course of a
session you accomplished what you
wanted. We then built models to predict,
based on this plentiful but unlabelled
data, what the outcome of interest
might be,
and we found that using just the click
you could predict whether somebody is
gonna mark a page as relevant less than
fifty percent of the time. So using just
clicks, which search engines had been
using, leaves some room to be
improved. If you incorporate clicks and
how long you spent on a page (if you
spend a very short time on a page it's
much less likely to be relevant than if
you spend a lot of time on it), and if
you look at other query reformulations
and how the session ended, you can
improve accuracy pretty dramatically.
We also could predict session success
almost perfectly by knowing whether
there was one page that somebody had
viewed during the session that was
relevant. That's a great result at
some level, but what we didn't probe
about was the effort that it took, or
whether it met people's expectations.
People might have been satisfied, but
it might have taken them a heck of a
lot longer. So we're lacking something
in these kinds of models. We've also
done
these kinds of things for abandonment,
looking at whether, when people look at
a page that has a search result front
and centre in the page and they don't
click, that's good or bad. And
again we did both a retrospective
survey to understand why people hadn't
clicked on things in the past as well
as one of these in situ surveys, and
again were able to develop models to
predict that. We've talked previously,
Jaime Teevan and others, about the
importance of re-finding in web search.
We think of search as a way to discover
new information, but often people wanna
re-find as well as find information,
and this is something that was obvious
from the web search logs. I guess I'm
done with that. Okay, so re-finding was
something that was discovered in large
scale web search logs. We were able to
develop new techniques that supported
re-finding as well as finding. So what
I tried to do in the last two examples
and then the shortened re-finding
example is show you how we move back
and forth constantly between large
scale observations, lab studies, and
then A/B testing in the wild to
understand whether ideas that are
derived from large scale logs are
actually relevant. Um,
let me just wrap up by saying that what
I've tried to do today is present a
picture of a rich set of tools that we
can use to understand searcher
behaviour, from lab studies to panels to
large scale logs. They offer
complementary benefits, and I think
large scale logs are unique in the fact
that they cover real behaviour, allow us
to see a tremendous diversity of tasks
that are really hard to get in any
other way, and are real time. There are
a number of challenges, as I've
highlighted, but I think they
offer tremendous potential. And I
thank you for your attention and
time. I'd also like to thank colleagues
from Bell Labs, Bellcore, and especially
folks who worked on an early version of
a tutorial we gave at CHI on some of
these same topics: Dan Russell, Jaime
Teevan, Robin Jeffries and Diane Tang.
So thank you for your attention. Thank
you so much, Sue. Thank you for
wrangling the technology. We can go
through a few questions. Okay, okay,
all right. Many
companies use these logs: Google,
Microsoft, Amazon. But many concerns
are arising around the issue of
privacy. Except for that, what do you
think could be other threats for the
user? Privacy is certainly an important
one, and one that I think is
appropriately discussed in this forum.
I'd much rather talk about some of the
possible issues. I think, you know, the
control and transparency about what's
being recorded is certainly important.
Some of the other threats, I think,
are that design needs to be both bottom
up, data driven, as well as top down,
sort of design inspired, and I think
it's easy to get into the mode where
you are totally driven by small scale
individual results without taking a
step back to think about broader design
implications. I mean, conversely, you
may have a brilliant design, but you
need to understand how to
evaluate whether it really is
working. I think the more you jump
outside the box and design radically
new systems, the harder it is to
evaluate them with some of these
existing techniques. It takes a lot of
time to familiarise yourself with a
particular domain and to
understand what the behavioural
signals are telling you. You
mentioned lab experiments. With
cameras becoming ubiquitous, how do you
think eye tracking and facial
expression detection techniques may
feed into future designs for the kinds
of information seeking that
people have been looking at? That's
a very timely question, because I
just spent about a year and a half
looking at what one can do in
designing new search experiences with
the assumption that things like eye
tracking will be broadly available
moving forward. And we're certainly
looking at rich multimodal kinds of
interactions. I think eye tracking alone
is something that's absolutely critical
for some populations with
disabilities, but the eyes fundamentally
are a sensor, not an effector, and so I
think we need to view the eyes, or
gaze tracking, as an indication of
attention but then complement it with
other techniques. One of the things
we've been doing is looking on large
displays at the focus of attention, and
then you can choose to zoom around
the focus of where you're looking, not
around the centre of the display or
some other area. So we're starting to
look at some of those experiences.
Wonderful. Okay, so: is Google the
number one search on Bing? I think I
gave the whole talk without using that
word. I actually don't
remember if it is. It is certainly in
the top hundred, as is Bing. Why can't
we
tell intent or goals from logs, or do
you think we will be able
to? I think the same query can mean
very different things, depending on
who's issuing it, when it is, where they
are. I think the only way to
link behaviour with intent is to have
some labelled data there. And
we try to do that; the Curious
Browser model is one that tried to link
relevance. Intents are so varied, I
think it'll be really hard to build
models for all of those. We try to do
the best we can. We are getting
better by looking at a much richer set
of signals than just the query.
Queries don't fall from the sky;
they're issued by real live human
beings at a particular point in time
and space. And understanding that
metadata, what previous queries in the
session, previous queries in the longer
term, I think is a way to get at that.
But I think we miss a lot of the
nuances that you find when you
talk to people about what they were
looking for. You know, "CHI 2015" might
be the query that somebody uses to get
information about the time that this
plenary session happens. It's gonna be
hard to detect that without
some subsequent interaction on the part
of the user. Right, and I think this
is gonna be the last question, 'cause
you already mentioned some of the
work you're doing with, you know,
methods and also the cameras and so
forth, but clearly some people in the
audience would want to sign up and work
with you. Okay, so if you would design
a new search system and you had the
resources of at least half of this
room, what would you ask us to do?
Yeah, first,
we have internships and jobs
available, so speak to me after and we
can work together. I think there
are several shortcomings of current
search engines. One is that we
don't do a very good job of supporting
tasks. Queries are treated not quite in
isolation, but we don't use what you do
with the results of searches. You
know, right now you copy and paste them
to something else. We don't have a
way of providing a richer way of
organising material, coming back,
reinstating tasks when you come back to
them. One of the examples I had
showed that sixty percent, it's fifty
percent, of the time that people spend
searching is in long sessions. So long
sessions don't happen often, only five
percent of the time, but people invest a
tremendous amount of time there, and I
think better supporting that is
important. I'd like us to also think
more broadly about proactive searching,
things that are just surfaced when you
need them. I think mobile searching is
changing the way in which we think
about articulating our information
needs. They tend to be much longer and
more natural queries and more dialogue
like. So I think we are seeing some
transformations. For me, better
supporting tasks and making things more
pervasive in ways that are more
proactive rather than reactive I think
are two really important directions.
Okay, please join me
in thanking the inspirational and
marvellous Sue Dumais. Okay, thank you.
Thanks.
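
Illustrative sketch (not part of the talk): the Curious Browser
discussion describes predicting explicit relevance judgements from
implicit signals, and notes that clicks alone do poorly while adding
dwell time helps. The Python below is a minimal sketch of that
comparison under loud assumptions: the Visit record, the 30-second
dwell threshold, and the toy data are all hypothetical and chosen
only for illustration. The talk's actual models and numbers are not
reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Visit:
    """One visit to a search result (hypothetical record layout)."""
    clicked: bool            # implicit signal: was the result clicked?
    dwell_seconds: float     # implicit signal: time spent on the page
    labeled_relevant: bool   # explicit judgement from an in situ probe


def click_only(visit: Visit) -> bool:
    """Baseline: treat every click as a relevance vote."""
    return visit.clicked


def click_and_dwell(visit: Visit, min_dwell: float = 30.0) -> bool:
    """Count a click as relevant only if dwell time is long enough.

    The 30-second default is an assumption for this sketch.
    """
    return visit.clicked and visit.dwell_seconds >= min_dwell


def accuracy(visits: List[Visit], predictor: Callable[[Visit], bool]) -> float:
    """Fraction of visits where the prediction matches the explicit label."""
    correct = sum(1 for v in visits if predictor(v) == v.labeled_relevant)
    return correct / len(visits)


# Toy, made-up data: several quick "bounce" clicks on non-relevant pages.
TOY_VISITS = [
    Visit(True, 4.0, False),
    Visit(True, 95.0, True),
    Visit(True, 8.0, False),
    Visit(True, 60.0, True),
    Visit(True, 12.0, False),
    Visit(False, 0.0, False),
    Visit(True, 45.0, False),  # long dwell but still judged not relevant
]

if __name__ == "__main__":
    print("click only:     ", accuracy(TOY_VISITS, click_only))
    print("click and dwell:", accuracy(TOY_VISITS, click_and_dwell))
```

On this toy data the dwell-aware rule agrees with the explicit labels
more often than the click-only rule, which is the qualitative point
made in the talk. A real study would learn the threshold (or a richer
model) from labelled data rather than fixing it by hand.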

Conference program

ACM-W Athena Lecture: Large-Scale Behavioral Data: Potential and Pitfalls
Susan Dumais, Distinguished Scientist, Microsoft and Deputy Managing Director, Microsoft Research Lab
23 April 2015 · 8:36 a.m.
