Transcriptions

Note: this content has been automatically generated.
00:00:00
Supporting and advocating
00:00:01
internationally for the full engagement
00:00:04
of women in all aspects of the
00:00:06
computing field, ACM-W's most
00:00:09
prestigious award is the Athena
00:00:11
Lecturer Award. This celebrates women
00:00:14
researchers who make fundamental
00:00:16
contributions to computer science. Each
00:00:19
year ACM-W honours one woman as the
00:00:22
Athena Lecturer. This year's
00:00:25
recipient of the Athena Lecturer Award is Susan
00:00:28
Dumais. As many of you already know,
00:00:32
in Greek mythology Athena is the
00:00:34
goddess of wisdom courage inspiration
00:00:38
civilisation, law and justice,
00:00:40
mathematics, strength, the arts, crafts,
00:00:45
skill and war strategy. I think that's
00:00:51
Sue's favourite. I've known Sue for
00:00:53
a very long time and have been lucky
00:00:55
enough to have had support and advice
00:00:57
from her in my own career. I can
00:00:59
honestly say she has many Athena-like
00:01:01
qualities and traits. She has the strength
00:01:04
and craft it takes to pursue difficult
00:01:06
problems in HCI and computer science and
00:01:09
to communicate those ideas with
00:01:12
great eloquence. She was born and
00:01:15
raised in Maine and got a BA
00:01:17
in mathematics and psychology from a
00:01:19
liberal arts college. She went on to
00:01:22
graduate with a PhD in cognitive
00:01:23
psychology from Indiana University. Sue has
00:01:27
made really strong contributions to our
00:01:29
CHI community for very many years. She
00:01:32
started out her career at Bell Labs and
00:01:34
Bellcore. And in nineteen ninety
00:01:36
seven she went to Microsoft Research, where
00:01:39
she has been doing stellar,
00:01:40
groundbreaking research and also acting
00:01:43
as an adjunct professor at the
00:01:44
University of Washington. She presented
00:01:48
a paper at the very first ACM CHI
00:01:50
conference in nineteen eighty three.
00:01:53
This paper was called "Using Examples to
00:01:55
Describe Categories". She also has the
00:01:58
distinction of presenting a very
00:02:00
memorable paper with a very hard-to-say
00:02:02
title at CHI nineteen
00:02:05
ninety two. I'm
00:02:08
gonna try and say it:
00:02:10
"Statistical semantics: how can a
00:02:12
computer use what people name things
00:02:15
to guess what people mean when they
00:02:17
name things?" She was clearly ahead of
00:02:21
the times, and at that time she was
00:02:23
already struggling with the hard
00:02:26
problems that she is now really
00:02:28
bringing big solutions to: what we call
00:02:30
personalisation in search and
00:02:33
information retrieval. This
00:02:36
theme runs throughout Sue's research.
00:02:39
And she's always been very
00:02:40
interdisciplinary and has a user-
00:02:42
centred perspective. She takes a very
00:02:44
broad approach and really tries to
00:02:47
bring users into everything she does.
00:02:50
She's been recognised in the HCI
00:02:52
community the IR community and the web
00:02:54
sciences community for her research.
00:02:57
And her awards are equally broad and
00:03:00
well deserved. She was inducted into the
00:03:02
CHI Academy in two thousand and five,
00:03:05
recognised as an ACM Fellow in two thousand
00:03:07
six, received the SIGIR
00:03:09
Gerard Salton Award for lifetime
00:03:11
achievement in two thousand and nine
00:03:13
and was inducted into the National
00:03:15
Academy of Engineering in two thousand
00:03:17
eleven. She received the Tony Kent Strix
00:03:19
Award in two thousand and fourteen. So
00:03:23
I'm very happy that she's also now
00:03:25
the Athena Lecturer. She's been a
00:03:27
tireless contributor in terms of
00:03:28
service to CHI as well. She's been
00:03:31
involved in a number of program committees.
00:03:33
She's run a number of doctoral
00:03:35
consortia and she has mentored many
00:03:38
people informally. She and Judy Olson
00:03:41
were the CHI PC co-chairs in nineteen
00:03:43
ninety four, and at that time, as some of
00:03:46
you remember, submissions were on paper.
00:03:48
Paper. Remember that?
00:03:52
I would also note that Judy Olson
00:03:56
was recognised as an Athena Lecturer in
00:03:58
two thousand and eleven. So this
00:04:00
incredibly powerful team have continued
00:04:03
to be influential and happily are being
00:04:05
recognised. Sue is the first person from
00:04:08
industry to be recognised as the Athena
00:04:11
Lecturer. And she illustrates that
00:04:13
it's possible to do amazing research,
00:04:15
inspire generations of researchers,
00:04:18
peers as well as younger researchers,
00:04:21
and influence products that have an
00:04:23
effect on the lives of millions. This is a
00:04:26
really well deserved award for Sue and
00:04:29
it's a great achievement for the HCI
00:04:31
community. I would really like you all
00:04:34
to congratulate and welcome Sue Dumais to
00:04:37
the stage. Thank you. yeah excellent
00:04:43
Okay, thank you Elizabeth for the
00:04:50
wonderful introduction. I'm deeply
00:04:57
honoured by the support, especially
00:04:59
since the nomination was by my peers
00:05:02
both in HCI and IR. As Elizabeth noted, I
00:05:05
really value the importance of a broad
00:05:07
interdisciplinary perspective in
00:05:09
attacking problems. And it's
00:05:11
wonderful to see this recognised. But
00:05:14
it's not just an award for me, it's
00:05:16
really an award for everybody in this
00:05:18
room and the broader CHI community. As
00:05:21
Elizabeth said, ACM awards one of these
00:05:23
a year, and so I take it as a
00:05:24
recognition by ACM of the importance
00:05:28
of the kinds of corporate pursuits that
00:05:31
we have in the human computer
00:05:32
interaction community more to computing
00:05:35
as well as to more broadly in in
00:05:37
people's lives and I'd like to shout
00:05:39
out especially to Judy Olson, who as
00:05:42
Elizabeth mentioned received the award
00:05:44
about four years ago and spoke at CSCW
00:05:47
and was co-chair with me of CHI
00:05:49
papers almost twenty years ago. So what
00:05:52
I'd like to do in the talk today is
00:05:55
talk about large scale behavioural log
00:05:57
data and both some of the amazing
00:06:00
opportunities as well as some of the
00:06:03
challenges and limitations of these.
00:06:06
The rise of web services over the last
00:06:08
decade has made it possible to gather
00:06:10
traces of human behaviour in situ, as
00:06:13
people are working in their natural
00:06:15
environments, at a scale and fidelity that
00:06:18
was previously unimaginable. This has
00:06:21
really transformed how web based
00:06:23
systems are designed, evaluated and
00:06:26
improved. So there are amazing
00:06:28
opportunities as well as challenges
00:06:30
here. Using examples from web search, I'll
00:06:32
talk about two kinds of logs. I'll
00:06:34
highlight how using observational logs
00:06:37
can provide a rich new lens onto the
00:06:40
diversity of the people, tasks and
00:06:44
interaction strategies that we see in
00:06:45
the web. And I'll also talk about how
00:06:49
experimental logs can transform how we
00:06:52
design and evaluate web systems. I'll
00:06:56
also talk at the end about some of the
00:06:57
challenges and and limitations. So to
00:07:01
highlight the importance of this
00:07:03
emerging new way of knowing if you will
00:07:06
I'd like to step back in time twenty
00:07:10
years. So twenty years ago in web
00:07:13
search and the web, and in fact the internet
00:07:16
itself, was really nascent. The NCSA
00:07:20
Mosaic graphical browser was less than
00:07:22
two years old. And modern web search
00:07:25
engines were less than a year old. And if
00:07:27
you remember Mosaic, yeah, this is
00:07:31
an older crowd, I like it. CHI ninety
00:07:37
five had an online presence, sort of
00:07:39
minimalist, Times New Roman. And there's
00:07:44
a really interesting highlight here
00:07:46
that says, to view the conference
00:07:48
at a glance you need a graphical browser. Okay,
00:07:53
the web was really nascent twenty
00:07:57
years ago I usually do this
00:08:00
interactively but that's a little hard
00:08:03
in this audience where I can't see
00:08:04
anything. The size of the web, looking at
00:08:07
the number of top level domains, was
00:08:08
about two point seven thousand,
00:08:11
okay, twenty seven hundred websites. Lycos
00:08:15
and WebCrawler were two of the early
00:08:18
search engines that actually indexed the
00:08:19
full content of pages. When Lycos
00:08:23
was released in late ninety four
00:08:26
it indexed fifty four thousand pages. Okay,
00:08:30
and in fact it didn't index the full
00:08:31
text of them because Fuzzy Mauldin
00:08:34
was unsure about what the copyright
00:08:36
issues were in building a full
00:08:38
positional, a full content index with
00:08:42
all the positional information from
00:08:44
which you could reconstruct the full
00:08:45
page. The times have really changed.
00:08:48
There's another interesting thing about
00:08:50
this site. It's really wonderful
00:08:52
to go back to the Internet Archive and look
00:08:54
at these sites as they existed decades
00:08:56
ago. There's a link to look at the top
00:08:58
five percent of the sites that you
00:09:00
could link and browse through the tent
00:09:02
outside of twenty five hundred of your
00:09:04
favourite web pages every day. What
00:09:08
I think is most relevant to the
00:09:10
topic today is that behavioural logs
00:09:13
were also next to nonexistent. So
00:09:16
Lycos received about a
00:09:18
thousand queries a day. You now go
00:09:20
through an order of magnitude more than
00:09:22
that in a modern web search engine
00:09:24
every second. And the reason for this
00:09:27
is that most search and most logging
00:09:29
was done on a client. I started work at
00:09:32
Microsoft in nineteen ninety seven and
00:09:35
one of the first things that
00:09:36
happened was the Office help team stopped
00:09:38
by my office and said, fabulous, you're here,
00:09:40
we have trouble with Office help. We
00:09:42
hear that it's a little less than ideal
00:09:45
place a terribly and so my first
00:09:49
question was, well, what's wrong?
00:09:51
What are people searching for? What are
00:09:52
they finding? And they sort of shrugged
00:09:54
and said, we don't really know. And it's
00:09:57
not because they were bad
00:10:01
engineers or bad human-computer
00:10:04
interaction folks. They literally didn't
00:10:06
know. All the documentation was on the
00:10:08
client, all searches were
00:10:10
done on the client and never sent
00:10:11
anywhere. Right, so when the move from
00:10:14
Office ninety seven to Office two
00:10:16
thousand was made, search went online.
00:10:19
And that really transformed all sorts
00:10:21
of things. First of all,
00:10:23
they saw for the first time all sorts
00:10:25
of things that people were asking about
00:10:27
that they had no documentation for,
00:10:29
things like: I used to do this in
00:10:30
Windows ninety seven or Office ninety
00:10:32
seven, how do I do it in Office two
00:10:33
thousand? It also really helped mitigate
00:10:38
some of the vocabulary mismatch
00:10:40
problems that Elizabeth alluded to in
00:10:42
that tongue-twisting title.
00:10:45
There's no place with more vocabulary
00:10:49
mismatch between what searchers are
00:10:51
looking for and what the authors have
00:10:53
written about than online documentation.
00:10:55
So they immediately saw people using
00:10:57
different words than the authors of the
00:11:02
documents did. Dramatically, with
00:11:04
no change in algorithms, by
00:11:06
incorporating understanding and
00:11:07
incorporating user behaviour, search
00:11:08
became better. Today we're in a far
00:11:11
different world. There are billions of
00:11:13
web sites, trillions of pages indexed by
00:11:15
search engines, billions of searches and
00:11:17
clicks every day. It's nothing short of
00:11:20
startling. The magnitude is just
00:11:22
startling, and the fact that it works
00:11:24
most of the time is nothing short of
00:11:26
amazing. Search has really, I think, over
00:11:29
the last decade been transformed from
00:11:32
an arcane skill that library
00:11:34
scientists or computer geeks had, to
00:11:36
something that absolutely everybody in
00:11:38
the world does every day. There's a
00:11:41
tremendous diversity of people using
00:11:44
search and the tasks they use it for.
00:11:46
Increasingly, we use it not just
00:11:49
to find information, but to buy things,
00:11:53
to look for medical information, to
00:11:59
plan travel, to monitor current events.
00:12:02
It really is a core fabric of our
00:12:04
everyday lives. And it's also very
00:12:06
pervasive. I'm going to talk today about web
00:12:07
search, but search is much broader. It
00:12:09
happens on the desktop, in the
00:12:11
enterprise and in apps. The one place where
00:12:13
I think it's lacking, and I was complaining
00:12:14
to Dan Russell about this the
00:12:16
other day, is in the real world. I was
00:12:17
walking through a store and I wanted
00:12:20
to find granola bars and my left
00:12:22
fingers were twitching. I kept trying to
00:12:24
type Control-F to get to
00:12:26
the granola bars. I think there's still
00:12:27
a ways to go here. But in these
00:12:31
times, where search has been transformed from
00:12:33
something that very very few people did,
00:12:35
a thousand people a day using the
00:12:37
most popular search engine on the web,
00:12:39
to something that billions of
00:12:40
people do every day, it is more and
00:12:44
more important to understand and
00:12:45
support searchers than ever
00:12:47
before. So what are behavioural logs?
00:12:53
They're traces of human behaviour seen through
00:12:55
whatever lens or whatever sensors we
00:12:58
have. In the physical world, we've all
00:13:01
seen books that fall open to the page
00:13:03
we've been to before. My statistics
00:13:05
books, if I open them up, fall right open
00:13:08
to some complicated test I can
00:13:10
never get right. Sometimes they're more
00:13:12
intentional: you might dog-ear a page, you
00:13:15
might write annotations and highlight,
00:13:17
put some marginalia, and you might
00:13:20
do much richer annotation. This
00:13:23
has all sorts of equations written out
00:13:25
as well as highlighting. It happens
00:13:27
not just in documents but in the
00:13:28
physical world. This is a path: you can
00:13:30
see the path that the engineers designed,
00:13:32
and the path that people take. And
00:13:35
sometimes these trails can be much
00:13:37
more ephemeral, such as footprints in
00:13:39
the sand. In the case of web search, the
00:13:42
trails that we see are search queries,
00:13:45
the results, what people click on, how
00:13:47
they reformulate queries, how long they
00:13:49
spend on pages. So a lot of what I'll be
00:13:51
talking about today is driven by these
00:13:54
behavioural logs. What's important about
00:13:57
behavioural logs, I think, is that
00:14:00
they're actual behaviour: people trying to
00:14:02
accomplish tasks in their natural
00:14:03
environment. They're not recalled
00:14:05
behaviours, they're not subjective
00:14:07
impressions of what went on yesterday,
00:14:09
they're not controlled laboratory tasks.
00:14:11
And behavioural logs can be used in
00:14:18
lots of ways. I'll detail these shortly.
00:14:22
They can be used a lot behind the
00:14:23
scenes to make it easier to find the
00:14:25
relevant information but they can also
00:14:27
be reflected in interfaces of
00:14:29
various kinds. This is a screenshot
00:14:32
of a system called Edit Wear and Read
00:14:34
Wear that Jim Hollan and colleagues
00:14:37
developed in, I guess it was, CHI
00:14:39
nineteen ninety two. On the left you see
00:14:41
a normal scroll bar with the region of
00:14:43
focus highlighted. And to the right of
00:14:45
that you see a lot of scroll bars that
00:14:48
are annotated by where people have
00:14:50
either written or edited the document,
00:14:53
so that the interaction with the
00:14:55
document that happens anyway is
00:14:57
reflected in an artifact that people
00:14:59
can use to better navigate. A more
00:15:03
recent example is some lovely work by
00:15:05
Adam Fourney and colleagues on something
00:15:08
called InterTwine. And here the idea
00:15:10
was to take browsers and feature-rich
00:15:12
applications and try to provide a kind
00:15:18
of inter-application scent. In this
00:15:20
particular screenshot, what he's done,
00:15:22
for somebody who searched a lot on
00:15:25
tough technical documentation, in
00:15:27
this case GIMP, is highlight what
00:15:30
people did in the application
00:15:33
with the search query that returned
00:15:35
it. So here it's about transforming
00:15:37
some picture into greyscale, and
00:15:38
you can see that in their actual search
00:15:40
results. So it's a really interesting way of
00:15:42
trying to bridge across multiple
00:15:44
applications with behavioural trails.
00:15:47
Maybe the most common example of of
00:15:49
these behavioural logs is auto-
00:15:51
completion in search. Here I typed in
00:15:54
CHI two thousand fifteen and you can
00:15:56
see all sorts of other things that
00:15:57
people typed after that. One of them
00:15:59
that I was a little surprised to see
00:16:00
was rebuttals. That turns out to be
00:16:02
something that people really care about
00:16:04
in CHI. And here's an example of
00:16:09
my personal search logs in
00:16:12
Bing that I can use for personal
00:16:13
reflection or re-finding. I guess it's
00:16:15
not so personal now that I've shared it with
00:16:17
some of my closest friends in the
00:16:19
audience, but I did look for comedy as
00:16:20
well as meeting planning and news
00:16:22
yesterday. I want to put large-scale
00:16:29
behavioural logs in the context of a
00:16:31
much broader set of research methods
00:16:33
that we use. This is a gross
00:16:34
oversimplification. If I don't mention
00:16:37
your favourite method, I realise that
00:16:39
there are a multitude of them
00:16:40
and that they're all interesting and
00:16:42
important. What I wanna do is focus
00:16:43
on three kinds in this talk. The
00:16:47
first is lab studies. These typically
00:16:50
involve tens or hundreds
00:16:52
of people. Often the tasks are known;
00:16:54
sometimes participants bring their own
00:16:56
tasks. Some of the beauty of
00:17:00
lab studies is that they can involve
00:17:02
very detailed instrumentation: video, eye
00:17:05
gaze, screen capture. People can
00:17:08
speak aloud. You get, in some
00:17:10
sense, a very thick trail of
00:17:14
people's behaviour, although for a
00:17:16
limited set of tasks. You can evaluate
00:17:18
systems that you write,
00:17:20
evaluate experimental systems, or even
00:17:23
systems that don't exist if you use
00:17:24
Wizard of Oz techniques. So they're
00:17:26
wonderful in their richness and depth.
00:17:29
They don't cover the breadth of
00:17:30
experiences that I think modern-
00:17:32
day web search services need to cover.
00:17:34
There are also panel studies, which sort of
00:17:36
are in between large-scale logs and
00:17:39
lab studies. These often involve
00:17:41
hundreds or thousands of people who
00:17:44
download specific client side software
00:17:49
that allows people to monitor or to
00:17:53
record behaviour. You can
00:17:56
also probe about specific tasks or
00:17:59
specific conditions that are of
00:18:02
interest. What's important about these
00:18:04
in contrast to lab studies is that
00:18:07
people are using their search
00:18:10
systems and their computers in ways that
00:18:12
they normally do, so you get a much
00:18:13
richer sense of tasks and activities
00:18:16
that people are performing in the wild.
00:18:17
And the last level of analysis I wanna
00:18:22
talk about is log studies. These, as I
00:18:24
mentioned before, in the case of web
00:18:26
search services involve millions of
00:18:28
people and tasks. There's a tremendous
00:18:31
diversity, a really eye opening
00:18:33
diversity, of what people are doing with
00:18:36
search engines and what kinds of things
00:18:38
we need to support. Here there's an
00:18:40
abundance of data, but it is very thin,
00:18:44
it's noisy and it's not
00:18:46
labelled. I'll talk about some of those
00:18:48
challenges later. So in this range of
00:18:53
experimental methods from lab studies
00:18:56
to panels to large-scale logs, I want to
00:18:58
highlight that one can do observational
00:19:01
studies as well as experimental
00:19:02
studies. So in the lab you can just
00:19:04
look at what people are doing, not
00:19:06
compare two systems. In panels we do it
00:19:09
through ethnography or case studies or
00:19:11
things like Nielsen studies. And in
00:19:13
the logs we often look at traces of
00:19:17
behaviour through Twitter and Wikipedia,
00:19:20
for example, where there's
00:19:22
no experiment, at least on the observer's
00:19:25
part, involved. But we can also do
00:19:28
controlled experiments at
00:19:30
all of these scales. In the laboratory
00:19:32
it's obvious. In panel studies there are
00:19:34
clinical trials and field trials. And
00:19:37
the lifeblood of many web search
00:19:39
systems is the ability to do carefully
00:19:42
controlled experiments, and understand
00:19:44
the effects that those have in realistic
00:19:46
settings. So the observational data
00:19:49
allows us to build sort of an abstract
00:19:51
picture of behaviour the sort of things
00:19:52
that we need to focus on and design.
00:19:55
And then experiments are really to help
00:19:57
us decide whether one approach is is
00:20:00
better than another. I think we
00:20:04
tend to think about experiments,
00:20:06
carefully controlled
00:20:07
experiments, as just happening in the lab
00:20:09
and observations happening at
00:20:11
web scale. But I think this slide
00:20:14
illustrates that regardless of the
00:20:16
scale of the experiment you can observe
00:20:18
in rich detail. And you can do careful
00:20:21
experimentation. What I'm gonna focus on
00:20:23
today is the lower right here,
00:20:25
looking both at observational and
00:20:27
experimental log studies. The
00:20:32
benefits of large scale behavioural
00:20:34
logs: there are many
00:20:36
of them. I wanna highlight three of them.
00:20:38
One is that they're real world: they're a
00:20:39
portrait of behaviour in the wild.
00:20:42
They're not imagined tasks; they
00:20:44
actually happen. In a lab there
00:20:47
are lots of things that people will not
00:20:49
do. You don't see a lot of porn, and you
00:20:51
don't see a lot of very personal
00:20:53
queries, even if you let people generate
00:20:56
their own tasks. You don't see repeat
00:20:58
behaviour: you don't see people issuing
00:20:59
the same query over and over again;
00:21:01
they'll switch to a new task. These are
00:21:04
all things, personal queries, personal
00:21:10
searches, repeat searches, that
00:21:13
are really prevalent in the world, and
00:21:14
so we don't see those in laboratory
00:21:16
studies. They're also large scale: we can
00:21:21
see millions of people and tasks, and
00:21:24
even rare behaviours can be examined.
00:21:27
So if you get a billion searches a
00:21:29
day, you can observe a one-in-a-million
00:21:31
occurrence a thousand times during the
00:21:33
course of the day. Okay, and as I've
00:21:36
mentioned before, one of the things that
00:21:38
strikes you over the head when you
00:21:40
look at how systems are being used
00:21:42
in the wild is that there's just
00:21:44
tremendous diversity in what people are
00:21:46
seeking and how they're trying to
00:21:48
satisfy those information needs. It's
00:21:51
what Chris Anderson has called
00:21:53
the long tail of information needs.
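The scale and long-tail points above can be made concrete with a small Python sketch. The function names and the toy query list are illustrative assumptions, not data or code from the talk.

```python
from collections import Counter


def expected_daily_occurrences(searches_per_day: int, one_in: int) -> float:
    """Expected daily count of a behaviour seen once in every `one_in` searches."""
    return searches_per_day / one_in


def head_and_tail(queries: list[str], head_size: int) -> tuple[float, int]:
    """Return (traffic share of the `head_size` most frequent queries,
    number of distinct queries that occur exactly once)."""
    counts = Counter(queries)
    total = sum(counts.values())
    head_traffic = sum(n for _, n in counts.most_common(head_size))
    singletons = sum(1 for n in counts.values() if n == 1)
    return head_traffic / total, singletons


# A one-in-a-million behaviour at a billion searches a day.
rare_per_day = expected_daily_occurrences(1_000_000_000, 1_000_000)
print(rare_per_day)

# Hypothetical toy log: two popular navigational queries plus one-off queries.
toy_log = ["yahoo"] * 5 + ["hotmail"] * 3 + ["chi 2015 rebuttals", "gyrocopter", "granola bars"]
share, singletons = head_and_tail(toy_log, head_size=2)
print(f"head share: {share:.2f}, queries seen once: {singletons}")
```

Even in this tiny made-up log, a couple of head queries carry most of the traffic while most distinct queries occur only once, which is the shape a search provider has to serve at both ends.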
00:21:54
Search logs, or behavioural logs, are
00:21:57
also real time, maybe not as real time
00:22:00
as Twitter, but the feedback about what
00:22:03
people are doing, often reflecting the
00:22:04
events happening in the real world, is
00:22:06
pretty immediate. This is the query
00:22:09
distribution, the frequency of queries
00:22:11
for the query "flu" over time. Two or
00:22:13
three years ago there was a huge spike
00:22:15
with the swine flu,
00:22:19
but you can see yearly peaks
00:22:22
around wintertime here. You can
00:22:25
also see events. This is the
00:22:28
query "gyrocopter", that didn't happen
00:22:30
very much at all over the last five or
00:22:32
six years. But a few days ago, or
00:22:35
last week I guess, when
00:22:37
somebody flew a gyrocopter onto the
00:22:40
lawn of the US Capitol, people all of a
00:22:44
sudden started asking about it. Okay, so
00:22:49
these are some of
00:22:51
the benefits of behavioural logs that
00:22:53
we can use to understand people and and
00:22:55
to improve systems. So I'll give you a
00:22:57
one-slide tutorial on behavioural logs
00:22:59
and web search. The question in web
00:23:01
search is how you go from, on average, two
00:23:03
or three query words to anything at all
00:23:05
sensible. It is miraculous that it
00:23:08
works, right? The first stage of web
00:23:11
search involved
00:23:15
matching queries that people issued to the
00:23:17
content of web pages. The next
00:23:19
generation involved also exploiting link
00:23:22
structure: understanding who links to
00:23:24
your page, which pages you link to. That
00:23:27
allows a retrieval engine to set non-
00:23:29
uniform priors on web pages in
00:23:32
retrieval. In the last decade or so, user
00:23:36
behaviour has become very
00:23:38
important: the anchor text we use to
00:23:40
point to web pages, the query and click
00:23:43
trails that we leave, query
00:23:45
reformulation (you try something, it
00:23:47
doesn't work, you try again). All of those
00:23:49
are important indicators of what people
00:23:52
are doing and what's succeeding or
00:23:54
not, as the case may be. Contextual
00:23:58
metadata is increasingly important. We
00:24:00
know where queries are asked, what time
00:24:03
it is, and those are really important in
00:24:05
determining what's relevant. If I ask
00:24:07
for, you know, I don't know, "US Open two
00:24:12
thousand fifteen", at this time of year I
00:24:14
probably wanna know about the golf
00:24:15
tournament, which just happened. In a few
00:24:17
months I'll want to know about the
00:24:18
tennis tournament. So all of this
00:24:20
contextual metadata is really important
00:24:23
in providing what's relevant to people
00:24:26
at the right time. So all of these
00:24:29
behavioural data are used to improve
00:24:38
web search algorithms in
00:24:40
many ways. Let me just give you a couple
00:24:41
of examples. So how many of you have ever
00:24:43
benefited from spelling correction in a
00:24:46
web search? Right, like, we can't type
00:24:48
anymore. It's not that algorithms for
00:24:50
spelling correction have gotten
00:24:52
markedly different or better over the
00:24:54
last decade. What has changed, what
00:24:57
dominates, is the amount of
00:24:58
information that we have. If people type
00:25:00
a query and retype a variant of it soon
00:25:03
thereafter, it's a pretty good signal that
00:25:06
either the search
00:25:07
engine didn't return the right thing, or that
00:25:09
perhaps it was a typo.
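As a rough illustration of the reformulation signal just described, the Python sketch below mines a query log for cases where a user retypes a close variant shortly after the first query. The log format, the 30-second window and the 0.8 similarity threshold are all assumptions made for this example, and the string similarity from `difflib` is a stand-in; nothing here is the method any search engine actually uses.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical log rows: (user_id, timestamp_in_seconds, query), in time order per user.
LOG = [
    ("u1", 0, "granola barz"),
    ("u1", 8, "granola bars"),
    ("u2", 0, "gyrocoptor"),
    ("u2", 5, "gyrocopter"),
    ("u3", 0, "gyrocoptor"),
    ("u3", 12, "gyrocopter"),
    ("u4", 0, "us open"),
    ("u4", 400, "us open tennis"),  # too late to count as a quick retype
    ("u5", 0, "flu"),
    ("u5", 6, "hotmail"),           # quick, but not a variant of the first query
]


def rewrite_candidates(log, max_gap=30, min_similarity=0.8):
    """Count (first_query, retyped_query) pairs where the same user issues a
    similar but different query within `max_gap` seconds."""
    pairs = Counter()
    last_seen = {}  # user_id -> (timestamp, query) of that user's previous query
    for user, ts, query in log:
        if user in last_seen:
            prev_ts, prev_query = last_seen[user]
            similar = SequenceMatcher(None, prev_query, query).ratio() >= min_similarity
            if query != prev_query and ts - prev_ts <= max_gap and similar:
                pairs[(prev_query, query)] += 1
        last_seen[user] = (ts, query)
    return pairs


candidates = rewrite_candidates(LOG)
for (typed, retyped), n in candidates.most_common():
    print(f"{typed!r} -> {retyped!r}: seen {n} time(s)")
```

The point of the sketch is that the evidence comes from aggregate behaviour rather than a cleverer spelling algorithm: a pair that many users produce independently is a stronger correction candidate than one seen once.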
00:25:12
We don't think twice about crazy
00:25:14
misspellings anymore in search. And
00:25:17
similarly, as I just mentioned, time and
00:25:19
location are really important in
00:25:21
helping interpret what people might
00:25:23
mean by a very short query. The same
00:25:25
query in different locations or at
00:25:27
different points in time should
00:25:31
probably be satisfied by different
00:25:34
results. So going back in time twenty
00:25:40
years there were lots of surprises in
00:25:43
looking at early web search logs. Web
00:25:47
search was nascent. The intuitions that
00:25:50
people had about what would work,
00:25:52
what web search was gonna
00:25:54
be used for, what algorithms would work,
00:25:57
were really intuitions derived from
00:25:58
library search. And so looking at the
00:26:00
early search logs provided really a
00:26:02
reality check on what people wanted to do
00:26:05
with this fabulous new tool, not what
00:26:06
designers anticipated them doing. So
00:26:09
there were several early web search
00:26:11
log analyses, some through publicly
00:26:14
available data that Doug Cutting
00:26:16
provided from Excite, and a lot of it
00:26:18
from work that Craig Silverstein and
00:26:20
Andrei Broder and others did at Alta
00:26:22
Vista. The thing that was
00:26:26
incredibly striking in the early web
00:26:28
search logs is that web search is not
00:26:30
library search. Queries are short. A lot of
00:26:36
people search for sex, more so in ninety
00:26:38
seven than ninety nine, and I just looked
00:26:41
yesterday at the top two hundred
00:26:43
searches on Bing, and the
00:26:46
prevalence of sex search queries has
00:26:48
decreased markedly over the years, in
00:26:50
part because there's other information
00:26:51
on the web about all sorts of things.
00:26:53
Navigation is a common behaviour. So
00:26:57
people assumed that searchers would
00:27:00
look for information in much the same
00:27:02
way that you do in libraries, but a
00:27:04
lot of searches are really aimed at
00:27:06
getting you to a location. So the most
00:27:09
common query in early search logs, we'll
00:27:14
see in a minute, was something
00:27:15
like Yahoo or Hotmail. They were
00:27:20
really ways of providing easy
00:27:22
navigation to sites rather than seeking
00:27:25
information. Queries are not independent:
00:27:28
you find lots of queries about a
00:27:30
particular task clustered at the same
00:27:32
time. As I mentioned before, there's a
00:27:34
huge tail of information needs and
00:27:40
strategies for solving them. This is
00:27:43
unique to web search versus library
00:27:45
search. And it's true for web search
00:27:48
versus desktop search. Things that you
00:27:50
think you know from the web don't
00:27:51
necessarily apply to the desktop. The
00:27:53
same is true looking at web search
00:27:56
versus mobile search. A lot of the
00:27:58
assumptions that we go into systems
00:28:01
with turn out, when you look at what
00:28:04
behaviours are going on, not to hold.
00:28:06
So I want to talk about the diversity,
00:28:07
really to give you a sense of how
00:28:10
important it is. The Excite logs that were
00:28:17
provided had two point five million
00:28:19
queries. Is it really nine o'clock?
00:28:23
Yes? Okay, I'll speed up. Two point five
00:28:27
million queries, but two
00:28:33
hundred and fifty of those queries
00:28:35
accounted for ten percent of the
00:28:36
traffic. So as a search provider you
00:28:40
absolutely have to nail those. In the tail,
00:28:44
on the other hand, there were almost a
00:28:46
million queries that occurred exactly
00:28:47
once. You also need to accommodate that.
00:28:49
So it's a Zipf distribution. Here are the
00:28:51
top ten queries in nineteen ninety nine.
00:28:55
you can see them some of them are
00:28:56
navigational some of the things that
00:28:57
were available on the well poke poke
00:28:59
him on MP three and so on your queries
00:29:02
that occurred only once some of them
00:29:05
are Y two K issues some of them but
00:29:07
where these completely crazy things
00:29:09
like provide me the email of Paul Allen
00:29:11
is the Seattle seahawks owner you need
00:29:14
to solve those needs. And their whole
00:29:16
bunch that are more intermediate
00:29:17
frequency. Queries vary over time and
00:29:20
task: there are periodicities, daily
00:29:22
periodicities, weekly periodicities,
00:29:25
things that are trending. And events
00:29:28
that happen are reflected in
00:29:31
logs. There are also important individual
00:29:35
and task differences that we see in
00:29:38
logs. So take a query like "ACM awards".
00:29:41
We've been talking about this
00:29:43
earlier today. If I issued the query
00:29:45
"CHI 2015" right before,
00:29:47
you can be pretty sure what I mean. If
00:29:50
you issued the query "country
00:29:52
music", you may have meant the Academy of
00:29:54
Country Music Awards, which happened
00:29:55
last week and probably received much
00:29:57
more press coverage. If Garth Brooks issues
00:30:00
the query he probably means something
00:30:01
different than I do. Here's an example
00:30:04
a really clean example, of a query
00:30:08
log. I wanna go through some of the
00:30:09
kinds of things you can get by
00:30:11
observing. If you look at the query
00:30:13
typology, you can distinguish
00:30:15
informational from navigational queries.
00:30:17
You can look at query frequency,
00:30:21
which varies: here "CHI 2015"
00:30:23
is the most common query; that's not
00:30:24
true in the web as a whole. You can
00:30:27
look at long-term trends: here's a
00:30:29
person who's queried "computational
00:30:30
social science" several
00:30:32
times. You can look at short-term
00:30:35
tasks: these are actually several
00:30:38
queries I performed just the other day,
00:30:39
looking for the program and for
00:30:41
registration times. So using these
00:30:45
kinds of insights about types of
00:30:47
queries, frequency of queries and the
00:30:49
behaviour of queries, we can design
00:30:52
and improve our ranking algorithms and
00:30:54
interfaces. I'll talk a little bit
00:30:56
more about that. Most importantly, we
00:30:59
can develop test sets that reflect real
00:31:03
behaviour rather than imagined
00:31:05
behaviour. Right, so this is actually a
00:31:11
very interesting example. I'm going to rush
00:31:14
through it, but I do want to mention it.
00:31:16
So this is going to be about
00:31:17
using search logs as a lens for going
00:31:20
beyond web search, beyond improving web
00:31:21
search per se. Search for health
00:31:24
information is incredibly common and
00:31:26
important. About eighty percent of US
00:31:28
adults have searched the web for
00:31:31
medical information. One in two
00:31:34
hundred fifty people query about the
00:31:36
top one hundred prescription medicines
00:31:38
in the US. So mining health search
00:31:41
data to identify things like adverse
00:31:44
drug effects or side effects is
00:31:46
something that's possible. Today
00:31:48
these side effects and interactions are
00:31:52
detected based on reports from
00:31:54
patients and clinicians. It's a slow
00:31:57
process and not a very
00:32:00
rich one. There was a study
00:32:03
reported in two thousand eleven looking
00:32:05
at the interactions of paroxetine,
00:32:07
which is an antidepressant drug, and
00:32:10
pravastatin, which is a cholesterol
00:32:12
reducing drug. Researchers
00:32:15
discovered that these two drugs taken
00:32:17
in combination seem to lead to
00:32:19
hyperglycemia. So Eric Horvitz and
00:32:22
Ryan White decided to see whether they
00:32:24
could find early warning signs of these
00:32:27
effects in search logs. So they looked
00:32:30
at search logs from before two thousand
00:32:32
eleven and found quite robust signals
00:32:37
that people who query on both of these
00:32:40
drugs have a much higher likelihood
00:32:42
of querying on side effects of
00:32:45
hyperglycemia, things like thirst or
00:32:47
increased appetite or high blood sugar.
00:32:49
So search logs can increase, I think,
00:32:52
the speed and scale of detecting some
00:32:55
of these. This isn't the be-all and end-all
00:32:57
of this; they're validating it in
00:32:58
lots of other ways. But I think it
00:33:00
highlights the potential to identify
00:33:02
some of these things, which can be
00:33:04
further studied then in other ways. I'll
00:33:09
skip this. Oh, so I'll get back to it
00:33:11
shortly. So we can go from observations
00:33:14
to experiments. Observations, as I said
00:33:15
before, generate insights about
00:33:17
behaviour and ideas for improving web
00:33:20
search, but experiments are really the
00:33:22
lifeblood of search engines. They are a
00:33:24
way to systematically
00:33:27
improve search. They're used to improve
00:33:29
all sorts of aspects of web search
00:33:31
systems, from system latency, where people
00:33:35
notice differences of as little as
00:33:37
fifty milliseconds in the page load
00:33:39
time and it influences search
00:33:43
behaviour in a multitude of ways. They're
00:33:46
also used to influence ranking
00:33:48
algorithms and to compare snippets. Snippets
00:33:53
are a very important but
00:33:56
overlooked, I think, aspect of web
00:33:58
search: they're the way that you figure
00:33:59
out whether to follow links more
00:34:02
fully. They're used to support and
00:34:04
evaluate different spelling and query
00:34:06
suggestion algorithms, and also richer
00:34:08
presentations. How do we know
00:34:10
whether putting these rich answer boxes
00:34:12
on the right is useful at all, and
00:34:14
for what cases is it useful? Experiments
00:34:17
allow folks to become much more data
00:34:20
driven rather than people driven.
00:34:22
Ronny Kohavi has this brilliant
00:34:24
acronym, HiPPO, standing for the
00:34:26
highest paid person's opinion. And so
00:34:29
without data, that's how things often
00:34:31
get arbitrated. I'm
00:34:35
not gonna talk about how to conduct
00:34:37
experiments at web scale; in
00:34:40
some ways it's very similar to
00:34:42
conducting experiments at smaller scale.
00:34:45
I do want to point out that some things
00:34:48
are much easier to study
00:34:50
experimentally than others.
00:34:51
Algorithms, like the ranking algorithm you
00:34:54
use behind the scenes, are pretty easy to
00:34:55
study; it doesn't influence user
00:34:57
behaviour. Other things, like new
00:34:59
interface techniques, are much harder
00:35:01
because they require changes in how we
00:35:03
interact with systems. And social
00:35:06
systems are incredibly hard to study
00:35:08
because there are huge spillover
00:35:09
effects from a treatment that I might
00:35:11
receive to the treatment that my friends
00:35:12
might see. So the value of behavioural
00:35:16
logs, I think, comes from providing often
00:35:19
surprising insights about how people
00:35:21
interact with existing systems. It
00:35:23
allows us to focus energy on supporting
00:35:26
actual versus presumed activities. It
00:35:29
suggests experiments about important
00:35:32
and unexpected behaviours, and it can
00:35:36
support a wide variety of search
00:35:37
experiences. In addition, we can improve
00:35:40
search in a whole host of ways using
00:35:42
controlled experiments. This really
00:35:44
has transformed how systems are
00:35:46
designed, evaluated and improved. However,
00:35:49
logs have some limits. They're
00:35:53
large but unlabelled. Logs can't
00:35:57
tell us what people are really trying to do,
00:35:59
whether they were successful, what their
00:36:01
experience was, or what they were
00:36:03
attending to. The same behaviour can
00:36:06
mean many different things. If I don't
00:36:09
click on a web page, is that good or bad?
00:36:11
Sounds bad, but if there's this
00:36:16
inline answer that shows me
00:36:17
exactly the result I want, like the
00:36:19
weather in Seoul today, it might be good.
00:36:21
Also, logs
00:36:27
are limited to existing systems.
00:36:31
Logs tell us a lot about
00:36:33
what and how people search, but not very
00:36:35
much about why. And it's important to
00:36:37
complement logs with a variety of other
00:36:40
techniques to provide a much richer and
00:36:42
more complete picture of what people
00:36:45
are doing and how we advance
00:36:48
that. I have several examples of this. I
00:36:51
was supposed to be ending in ten
00:36:55
minutes? Okay, I will go through just a
00:37:00
couple of these, not all the rest
00:37:02
of them, and wrap up and leave time for
00:37:04
some questions. So an important thing
00:37:07
is to try to capture something about
00:37:11
the why rather than just the what. So
00:37:14
about a decade ago we built a system
00:37:16
called Curious Browser. It
00:37:19
was client code that you downloaded. It
00:37:23
captured a lot of implicit activities.
00:37:25
So what queries you were issuing, what
00:37:27
the query reformulations were, what the
00:37:29
clicks were, how long the dwell time was,
00:37:32
and we also probed for explicit
00:37:34
judgements about the relevance of the
00:37:36
page and the success. Okay, folks at the
00:37:40
controls, if you could plug in or find
00:37:43
another power source that would be
00:37:44
great. Yeah, so we would probe for the relevance
00:37:52
of the page or the success of the
00:37:56
session. Yeah, oh, it's not
00:38:03
working. Can you run the slides
00:38:06
from the back? Next point. Okay, well,
00:38:20
I'll describe what we did. What we did
00:38:22
was, given a set of
00:38:25
search results, if you visited a web
00:38:26
page... oh, okay, let's see if it works
00:38:29
again. Great. If you visited a web page and
00:38:33
later came back to the search
00:38:34
engine, or under a number of other
00:38:36
conditions that we had in a state
00:38:38
machine, we would probe whether that
00:38:40
search result was good, so-so, or not
00:38:44
so good. We also probed about the
00:38:47
session: whether during the course of a
00:38:49
session you accomplished what you wanted.
00:38:51
We then built models to predict, based
00:38:53
on this plentiful but unlabelled data,
00:38:56
what the outcome of interest might be,
00:38:58
and we found that using just the click,
00:39:01
you could predict whether somebody is
00:39:03
gonna mark a page as relevant less than
00:39:05
fifty percent of the time. So using just
00:39:08
clicks, which search engines had been
00:39:09
using, leaves some room to be
00:39:13
improved. If you incorporate clicks and how
00:39:16
long you spent on a page, where if you
00:39:17
spend a very short time on a page it's
00:39:19
much less likely to be relevant than if
00:39:21
you spend a lot of time on it, and if you
00:39:23
look at other query reformulations and
00:39:25
how the session ends, you can improve
00:39:26
accuracy pretty dramatically. We also
00:39:30
could predict session success almost
00:39:32
perfectly by knowing whether there was
00:39:34
one page that somebody had viewed during
00:39:37
the session that was relevant.
00:39:40
That's a great result at
00:39:44
some level, but what we didn't probe
00:39:46
was the effort that it took, whether it
00:39:48
met people's expectations. People
00:39:49
might have been satisfied, but it might
00:39:51
have taken them a heck of a lot longer.
00:39:53
So we're lacking something in
00:39:55
these kinds of models. We've also done
00:39:57
these kinds of things for abandonment,
00:39:59
looking at whether, when people look at
00:40:02
a page that has a search result front
00:40:05
and centre in the page and they don't
00:40:06
click, that's good or bad. And
00:40:08
again we did both a retrospective
00:40:11
survey to understand why people hadn't
00:40:13
clicked on things in the past, as well
00:40:15
as one of these in situ surveys, and
00:40:17
again were able to develop models to
00:40:19
predict that. We've talked previously,
00:40:24
Jaime Teevan and others, about the
00:40:28
importance of re-finding in web search.
00:40:30
We think of search as a way to discover
00:40:32
new information, but often people wanna
00:40:33
re-find as well as find information, and
00:40:36
this is something that was obvious from
00:40:37
the web search logs. I guess I'm done
00:40:43
with that. Okay, so re-finding was something
00:40:45
that was discovered in large-scale web
00:40:47
search logs. We were able to develop
00:40:50
new techniques that supported
00:40:53
re-finding as well as finding. So what
00:40:57
I tried to do in the last two examples
00:41:00
and then the shortened re-finding
00:41:03
example is show you how we move back
00:41:06
and forth constantly between large
00:41:08
scale observations, lab studies and then
00:41:11
A/B testing in the wild to understand
00:41:16
whether ideas that are derived from
00:41:17
large-scale logs are actually relevant.
00:41:20
Let me just wrap up by saying that what
00:41:26
I've tried to do today is present a
00:41:28
picture of a rich set of tools that we
00:41:32
can use to understand searcher
00:41:34
behaviour, from lab studies to panels to
00:41:37
large-scale logs. They offer
00:41:39
complementary benefits, and I think
00:41:40
large-scale logs are unique in the fact
00:41:44
that they cover real behaviour, allow us
00:41:50
to see a tremendous diversity of tasks
00:41:52
that are really hard to get in any
00:41:55
other way, and are real time. There are
00:41:59
a number of challenges, as I've
00:42:01
highlighted, but I think they
00:42:03
offer tremendous potential. And I
00:42:05
thank you for your attention and
00:42:07
time. I'd also like to thank colleagues
00:42:09
from Bell Labs, Bellcore, and especially
00:42:11
folks who worked on an early version of
00:42:14
a tutorial we gave at CHI on some of
00:42:16
these same topics: Dan Russell, Jaime
00:42:18
Teevan, Robin Jeffries and Diane Tang. So
00:42:21
thank you for your attention. Thank you so much,
00:42:30
Sue. Thank you for wrangling the
00:42:32
technology. We can go through a few
00:42:35
questions. Okay, okay, alright. Many
00:42:41
companies use these logs: Google,
00:42:44
Microsoft, Amazon. But many concerns are
00:42:47
arising around the issue of privacy.
00:42:50
Except for that, what do you think could be
00:42:53
other threats for the user? Privacy is
00:43:02
certainly an important one, and one
00:43:04
that I think is appropriately discussed
00:43:08
in this forum. I'd much rather talk
00:43:10
about some of the possible
00:43:13
issues. I think, you know, control and
00:43:16
transparency about what's being
00:43:18
recorded is certainly important.
00:43:20
Some of the other threats, I think,
00:43:26
are that design needs to be both bottom
00:43:31
up, data driven, as well as top down, sort
00:43:33
of design inspired, and I think it's
00:43:36
easy to get into the mode where you
00:43:39
are totally driven by small scale
00:43:43
individual results without taking a
00:43:45
step back to think about broader design
00:43:47
implications. I mean, conversely, you may
00:43:49
have a brilliant design, but you need to
00:43:51
understand how to
00:43:58
evaluate whether it really is
00:44:00
working. I think the more you jump
00:44:02
outside the box and design radically
00:44:04
new systems, the harder it is to
00:44:06
evaluate them with some of these
00:44:07
existing techniques. It takes a lot of
00:44:10
time to familiarise yourself with a
00:44:12
particular domain and
00:44:20
to understand what the behavioural
00:44:22
signals are telling you. You
00:44:25
mentioned lab experiments. With
00:44:28
cameras becoming ubiquitous, how do you
00:44:30
think eye tracking and facial
00:44:33
expression detection techniques may
00:44:36
feed into future designs for the kinds
00:44:38
of information seeking that
00:44:40
people have been looking at? That's
00:44:42
a very timely question, because I
00:44:44
just spent about a year and a half
00:44:46
looking at what one can do in
00:44:49
designing new search experiences with
00:44:52
the assumption that things like eye
00:44:53
tracking will be broadly available
00:44:56
moving forward. And we're certainly
00:45:01
looking at rich multimodal kinds of
00:45:04
interactions. I think eye tracking alone
00:45:06
is something that's absolutely critical
00:45:10
for some populations with
00:45:12
disabilities, but the eyes fundamentally
00:45:16
are a sensor, not an effector, and so I
00:45:20
think we need to view the eyes, or
00:45:23
gaze tracking, as an indication of
00:45:26
attention, but then complement it with
00:45:28
other techniques. One of the things we've
00:45:29
been doing is looking on large displays
00:45:32
at the focus of attention, and then
00:45:34
you can choose to zoom around
00:45:37
the focus of where you're looking, not
00:45:39
around the centre of the display or
00:45:40
some other area. So we're starting to
00:45:42
look at some of those experiences.
00:45:43
Wonderful. Okay, so is Google the
00:45:46
number one search on Bing? I think I
00:45:50
gave the whole talk without using that
00:45:52
word. I actually don't
00:45:56
remember. It is certainly in the
00:45:59
top hundred, as is Bing. Why can't we
00:46:04
tell intent or goals from logs, or do
00:46:07
you think in the future we will be able
00:46:09
to? I think the same query can mean
00:46:17
very different things, depending on
00:46:20
who's issuing it, when it is, where they
00:46:22
are. I think the only way to
00:46:31
link behaviour with intent is to have
00:46:33
some labelled data there, and
00:46:37
we try to do that. The Curious
00:46:39
Browser model is one that tried to link to
00:46:41
relevance. Intents are so varied, I think
00:46:45
it'll be really hard to build models
00:46:47
for all of those. We try to do the
00:46:49
best we can. We are getting
00:46:51
better by looking at a much richer set
00:46:54
of signals than just the query.
00:46:56
Queries don't fall from the sky;
00:46:58
they're issued by real live human
00:47:00
beings at a particular point in time
00:47:02
and space. And understanding that
00:47:05
metadata, previous queries in the
00:47:08
session, previous queries in the longer
00:47:10
term, I think is a way to get at that.
00:47:12
But I think we miss a lot of the
00:47:13
nuances that you find when you
00:47:15
talk to people about what they were
00:47:17
looking for. You know,
00:47:19
"CHI 2015" might
00:47:21
be the query that somebody uses to get
00:47:23
information about the time that this
00:47:26
plenary is happening. It's gonna be hard
00:47:28
to detect that without
00:47:31
some subsequent interaction on the part
00:47:33
of the user. Right, and I think this
00:47:35
is gonna be the last question, 'cause
00:47:36
you already mentioned some of the
00:47:37
work you're doing with, you know, methods
00:47:40
and also the cameras and so forth, but
00:47:42
clearly some people in the audience
00:47:43
want to sign up and work with you. Okay,
00:47:46
so if you were to design a new search
00:47:48
system and you had the resources of at
00:47:52
least half of this room, what would you
00:47:54
ask us to do? Yeah, first,
00:48:00
we have internships and jobs
00:48:02
available, so speak to me after and we
00:48:04
can work together. I think there
00:48:08
are several shortcomings of current
00:48:10
search engines. One is that we
00:48:13
don't do a very good job of supporting
00:48:15
tasks. Queries are treated not quite in
00:48:20
isolation, but we don't use what you do
00:48:24
with the results of searches. You
00:48:27
know, right button, save them, copy them
00:48:29
to something else. We don't have a
00:48:30
way of providing a richer way of
00:48:35
organising material, coming back,
00:48:37
reinstating tasks when you come back to
00:48:39
them. One of the examples I had
00:48:42
showed that fifty
00:48:46
percent of the time that people spend
00:48:48
searching is in long sessions. So long
00:48:50
sessions don't happen often, only five
00:48:52
percent of the time, but people invest a
00:48:54
tremendous amount of time there, and I
00:48:57
think we should better support that. I'd like
00:48:59
us to also think more broadly about
00:49:02
proactive searching: things that are
00:49:05
just surfaced when you need
00:49:07
them. I think mobile searching is
00:49:08
changing the way in which we think
00:49:12
about articulating our information needs:
00:49:15
they tend to be much longer and
00:49:17
more natural queries, and more dialogue
00:49:21
like. So I think we are seeing some
00:49:22
transformations. For me, better
00:49:24
supporting tasks and making things more
00:49:28
pervasive, in ways that are more
00:49:32
proactive rather than reactive, I think
00:49:34
are two really important directions.
00:49:35
Okay, please join me
00:49:39
in thanking the inspirational and
00:49:41
marvellous Sue Dumais. Okay, thank you. Thanks.
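
A side note on the Excite figures quoted in the talk (about two hundred and fifty queries covering ten percent of traffic, almost a million queries seen exactly once): the head and the tail of a query log can be measured with a few lines of Python. This is only a minimal sketch, not the analysis behind the talk; the function name `head_tail_summary` and the toy log are made up for illustration.

```python
from collections import Counter

def head_tail_summary(queries, head_size):
    """Return (head_share, singleton_count) for a list of query strings.

    head_share: fraction of all traffic covered by the `head_size`
    most frequent distinct queries.
    singleton_count: number of distinct queries that occur exactly once.
    """
    counts = Counter(queries)
    total = sum(counts.values())
    if total == 0:
        # An empty log has no head and no tail.
        return 0.0, 0
    head_traffic = sum(n for _, n in counts.most_common(head_size))
    singletons = sum(1 for n in counts.values() if n == 1)
    return head_traffic / total, singletons

# Toy log (hypothetical data, not the Excite log).
toy_log = ["yahoo"] * 5 + ["hotmail"] * 3 + ["pokemon", "mp3", "y2k bug"]
share, singletons = head_tail_summary(toy_log, head_size=2)
print(f"top 2 queries cover {share:.0%} of traffic; "
      f"{singletons} queries occur exactly once")
```

On a real log, a large head share from a small `head_size` together with a large singleton count is the head-and-tail pattern described in the talk: a few queries that have to be handled very well, and a long tail that still has to be accommodated.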

Conference Program

ACM-W Athena Lecture: Large-Scale Behavioral Data: Potential and Pitfalls
Susan Dumais, Distinguished Scientist, Microsoft and Deputy Managing Director, Microsoft Research Lab
23 April 2015 · 8:36 a.m.
