17. Evaluation with Users
Now, test an interface with real users! (the last … before the end of the semester)
- Users are human beings, not simulators
- Ethics
- Responsibilities!
Types of User testing
- Qualitative / Naturalistic
- Quantitative / Experimental
- Field study (only feasible once the design is highly polished)
Participant standpoint
- Testing is a distressing experience.
- Pressure to perform
- Feeling inadequate
- Looking like a fool
Milgram's Obedience Experiment
A study suggesting that an authority figure plus a social peer group influenced more than 70% of participants.
→ But it was not ethical
- Deceived participants
- Put them under more pressure than many believe was necessary
Was it useful? → Did we learn anything that can be broadly applied? Or was it done just for fun, out of curiosity?
Was it ethical? → Was there no other, ethical way to do it? For example, explaining the whole situation up front, or using role-play.
Treating Subjects with respect
Follow human subject protocols
- Individual test results should be confidential
- Users can stop the test at any time
- Users are aware of the monitoring technique
- Their performance will not have implications for their life (e.g., it must not affect promotion)
- Records will be anonymous
Use a standard informed consent form
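One way to keep records free of names is to store a random participant code instead. The sketch below is only an illustration: the function name `make_codes`, the `P-` prefix, and the sample names and field names are all assumptions, not part of any protocol from the lecture. Strictly speaking this is pseudonymization, so the name-to-code key has to be stored separately from the data or destroyed.

```python
import secrets

def make_codes(names):
    """Map each real name to a random code such as 'P-3f9a1c' (hypothetical format)."""
    codes = {}
    used = set()
    for name in names:
        code = "P-" + secrets.token_hex(3)
        while code in used:              # regenerate on the rare collision
            code = "P-" + secrets.token_hex(3)
        used.add(code)
        codes[name] = code
    return codes

# Hypothetical participants; keep this key apart from the recorded data.
key = make_codes(["Alice", "Bob"])
records = [{"participant": key["Alice"], "task_time_s": 41.2}]
print(records)
```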
Conducting Experiment
Before experiment
- Have them read and sign the consent form
- Explain the goal of the experiment
During experiment
- Stay neutral
- Never indicate displeasure (no sighing, no muttering "ugh, they are so bad at this", no tut-tutting)
After the experiment
- Debrief users
- Inform users about goal
- Answer any questions!
Managing Subjects
- Don’t waste users’ time
- Use pilot tests to debug experiments!
- Have everything ready
- Make users comfortable
- Keep a relaxed atmosphere
- Allow breaks
- Pace tasks correctly
- Compensation
- Pay them!
Concerns of User Testing
Internal Validity
- The observed results are caused by the independent variables
- Confidence in our explanation
- Usually good in an experimental setting
- Watch for confounding variables
In other words, with no other influences, performance in the experiment is attributable to the independent variable (p
External Validity
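Do the results also apply to other environments and other populations?
- Generalizability
- Confidence that the results apply in real situations
→ These two (internal and external validity) trade off against each other.
Reliability
If the experiment is repeated under the same conditions, will it give the same results?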
Considerations on…
Internal Validity
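- Ordering effect
- X first? Y first?
- Learning effect!
- Participants get tired…
- Selection Bias
- Groups that we believed were randomly chosen can still be biased
- Experimenter Bias
- The tendency of the experimenter to interpret the results in favor of the outcome they want
→ Double-blind experiment (hire a conductor who does not know what the research question is)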
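Ordering effects and selection bias are often handled together by shuffling participants and then alternating the order of conditions (counterbalancing). The following is a minimal sketch under that assumption; `assign_orders`, the participant labels, and the fixed seed are hypothetical choices for illustration.

```python
import random

def assign_orders(participants, conditions=("X", "Y"), seed=0):
    """Shuffle participants, then alternate X-first and Y-first orders."""
    rng = random.Random(seed)            # fixed seed so the plan can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)                # random assignment, to reduce selection bias
    orders = [tuple(conditions), tuple(reversed(conditions))]
    # Alternating orders spreads learning and fatigue evenly over X and Y.
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}

plan = assign_orders(["P1", "P2", "P3", "P4", "P5", "P6"])
for participant, order in sorted(plan.items()):
    print(participant, "->", " then ".join(order))
```

With an even number of participants, half of them see X first and half see Y first.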
External Validity
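- Population
- Are the participants representative of the target population?
- Ecological Validity
- How closely does the experimental setting match the real world?
- Training validity
- Did we teach too much through tutorials?
- Task validity
- Do the tasks in the experiment represent what people actually do?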
Qualitative Evaluation
The raw data is non-numeric.
- Observations, video
- Open-ended interviews
- Narrative, textual description
→ Focus on the quality, richness, and depth of the data, not on reducing it to numbers.
Grounded Theory approach
- Data-driven method for building theory from qualitative data
- Aim: generate new theory grounded in the data itself
Usability Study - Qualitative
Purpose
Understand the user’s perception
Emphasize the users’ ability to use the system
Methods
- Introspection (cognitive walkthrough)
- Direct observation
- Interviews and questionnaires
First, create the tasks.
- End goals
- Specific and realistic
- Doable
- Not too long
Cognitive walkthrough (Introspection)
Designer tries the system out (without users)
- Completely Subjective
- Designer is non-typical user
Direct Observation
Observing users interacting with system
- Good for identifying gross design/interface problems
→ The observations need to be coded (labeled by category) before analysis
Three approaches
- Simple observation
- Think-aloud
- Constructive interaction
- Simple observation
- Evaluator observes!
- Drawbacks
- No insight into the user’s decision process or attitude
- Think aloud
- Subject asked to say what they are thinking
- Widely used
- Drawbacks
- Awkward for subject, not natural
- Thinking about it may alter the way people perform their task
- Hard to talk when they are concentrating.
- Constructive Interaction Method
- Two people work together on a task
- Normal conversation between the two users is monitored (less distortion)
- Removes awkwardness of think-aloud
- Co-discovery learning
- Use coach and naive subject together
- Make naive subject use the interface
- Drawbacks
- Need good team!
Interviews
- Pick the right population
- Be prepared
- Probe more deeply on interesting issues (focus on goals)
Pros
- Very good at directing next design phase
Cons
- Subjective (leading questions)
- Time-consuming
Debriefing
- Post-observation interviews
Pros
- Avoid erroneous reconstruction
Cons
- Time-consuming
Questionnaires, Survey
- Pick population
- Establish purpose
- Establish means
- Design the questionnaire (and debug it)
- Deliver
Pros
- Can reach a large population
- As good as the questions asked (good questions yield good answers)
Cons
- Preparation is expensive
- Data collection can be tedious
Closed Question
Supply possible answers
- Easy to analyze
- Can make it more difficult for the respondent
- Be sure to be specific
→ Make sure to pick an odd number of scale points!
Ex)
- Scalar (1 to 5; an odd number of points)
- Multi-choice (can be exclusive or not)
- Ranked choice (Helpful for preference)
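Closed questions are easy to analyze because the answers are already categories. As a rough illustration, assuming a 5-point scalar question, the sketch below counts the answers per scale point and reports the median; `summarize_scale` and the sample answers are hypothetical.

```python
from collections import Counter
from statistics import median

def summarize_scale(responses, points=5):
    """Count answers per scale point and report the median rating."""
    if any(r < 1 or r > points for r in responses):
        raise ValueError("response outside the scale")
    counts = Counter(responses)
    # Include scale points nobody chose, so gaps stay visible.
    distribution = {p: counts.get(p, 0) for p in range(1, points + 1)}
    return distribution, median(responses)

# Hypothetical answers to one 5-point question
answers = [4, 5, 3, 4, 2, 4, 5, 3]
distribution, mid = summarize_scale(answers)
print(distribution)
print("median:", mid)
```

The median is used rather than the mean because scale points are ordered categories, and the distance between adjacent points is not guaranteed to be equal.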
Open-ended Questions
Respondents answer in their own words
- Good for general information
- Difficult to analyze
- Can complement closed question
So, what is the outcome?
- High-level effect
- Task flow problems: problems in the flow of the task
- Task description problems: parts that were not explained well?
- Contextual findings: contextual factors such as one-handed use
Pros
- Apply to real situation
- Good external validity
Cons
- Poor internal validity
- Poor control of independent(predictor) variables
- Often subjective data