MaxDiff Analysis of MSBA Class Preferences

Counts, maximum likelihood, and Bayesian MNL estimates

Estimating relative preferences for ten MSBA core classes using Best-Worst Scaling, MNL maximum likelihood, and a Metropolis-Hastings Bayesian model.

Author

Kelun Wang

Published

May 26, 2026

Introduction

MaxDiff, also called Best-Worst Scaling, is a survey method for learning relative preferences. Instead of asking people to rate every option on a 1-to-10 scale, it asks them to make trade-offs. On each screen, a respondent sees a small set of items and chooses the one they like most and the one they like least.

That setup is useful here because people often use rating scales differently. One student might call almost everything a 9, while another student saves 9s for rare favorites. MaxDiff pushes respondents to compare the options directly, so the data reveal which classes win when they are placed next to other classes.

In this assignment, the items are the 10 core MSBA classes at UCSD Rady. I estimate class preferences three ways: a simple counts score, a multinomial logit model fit by maximum likelihood, and the same MNL model fit with a Bayesian Metropolis-Hastings sampler. I expect the methods to agree on the top and bottom classes, with more movement in the middle where preferences are closer together.

The Data

The dataset contains 85 respondents. Each respondent completed 8 MaxDiff tasks (680 tasks in total), each task showed 4 classes, and the study included 10 classes in total.

MaxDiff Task Structure Check
Tasks	4-item tasks	One best	One worst	Two neutral	All valid?
680	680	680	680	680	TRUE

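The sanity check passes: every task has exactly one best choice, one worst choice, and two unselected items. The exposure counts are also balanced. Each of the ten classes was shown between 323 and 340 times, so differences in preference are not being driven by one class appearing much more often than another. The table also lists an eleventh item code with no class label (NA), shown 595 times; it is not one of the ten classes and does not enter the MNL models.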
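
A structure check of this kind can be sketched in a few lines. The Python below is illustrative only and is not the code behind the table above: the long-format layout (one row per task-item pair, with `choice` coded 1 for best, -1 for worst, 0 for unselected) and the names `rows` and `task_is_valid` are assumptions, not the assignment's actual data schema.

```python
from collections import defaultdict

# Hypothetical long-format rows: (respondent_id, task_id, item_id, choice).
# choice is assumed to be 1 = best, -1 = worst, 0 = not selected.
rows = [
    (1, 1, 9, 1), (1, 1, 3, -1), (1, 1, 5, 0), (1, 1, 2, 0),
    (1, 2, 7, 1), (1, 2, 6, -1), (1, 2, 4, 0), (1, 2, 10, 0),
    (2, 1, 5, 1), (2, 1, 1, 0), (2, 1, 8, 0), (2, 1, 6, 0),  # no worst: invalid
]

def task_is_valid(choices):
    """True if a task shows 4 items with one best, one worst, two unselected."""
    return (
        len(choices) == 4
        and choices.count(1) == 1
        and choices.count(-1) == 1
        and choices.count(0) == 2
    )

# Group the choice codes by (respondent, task).
tasks = defaultdict(list)
for resp, task, item, choice in rows:
    tasks[(resp, task)].append(choice)

validity = {key: task_is_valid(ch) for key, ch in tasks.items()}
n_valid = sum(validity.values())
print(f"{n_valid} of {len(validity)} tasks pass the structure check")
```

In this toy input the third task has no worst choice, so two of the three tasks pass.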

Exposure by Class
Item	Class	Times shown
1	MGTA 403 AI Math, Prog., & Analytics (Summer with Nijs)	332
2	MGTA 464 SQL (Summer with Nijs)	340
3	MGTA 451 Marketing/Finance/Operations (Summer with Wilbur/Buti/Shin)	326
4	MGTA 452 Large Data (Fall with Hansen)	338
5	MGTA 453 Business Analytics (Fall with August)	327
6	MGTA 444 Analytics Consulting (Winter with Peterson)	323
7	MGTA 455 Customer Analytics (Winter with Nijs)	335
8	MGTA 454 Capstone Project (Spring with Advisor)	330
9	MGTA 495 Marketing Analytics (Spring with Yavorsky)	330
10	MGTA 457 Biz. Intelligence Systems (Fall with Schibler)	334
11	NA	595

Counts Analysis

The counts analysis is the fastest way to summarize the survey. For each class, I calculate the percentage of appearances where it was selected as best and subtract the percentage of appearances where it was selected as worst:

\[ \text{score}_j = \%\text{best}_j - \%\text{worst}_j \]

A positive score means the class is chosen as best more often than worst. A negative score means the opposite. The table reports the score as a proportion, so it can range from -1 to 1. This method is descriptive and easy to explain, though it does not fully model the choice process.

Counts Scores by Class
Item	Class	Shown	Best	Worst	% best	% worst	Score
9	MGTA 495 Marketing Analytics (Spring with Yavorsky)	330	199	12	60.3%	3.6%	0.567
5	MGTA 453 Business Analytics (Fall with August)	327	143	55	43.7%	16.8%	0.269
7	MGTA 455 Customer Analytics (Winter with Nijs)	335	155	71	46.3%	21.2%	0.251
11	NA	595	135	0	22.7%	0.0%	0.227
4	MGTA 452 Large Data (Fall with Hansen)	338	122	60	36.1%	17.8%	0.183
8	MGTA 454 Capstone Project (Spring with Advisor)	330	124	69	37.6%	20.9%	0.167
2	MGTA 464 SQL (Summer with Nijs)	340	111	56	32.6%	16.5%	0.162
10	MGTA 457 Biz. Intelligence Systems (Fall with Schibler)	334	74	55	22.2%	16.5%	0.057
1	MGTA 403 AI Math, Prog., & Analytics (Summer with Nijs)	332	76	90	22.9%	27.1%	-0.042
3	MGTA 451 Marketing/Finance/Operations (Summer with Wilbur/Buti/Shin)	326	75	101	23.0%	31.0%	-0.080
6	MGTA 444 Analytics Consulting (Winter with Peterson)	323	61	111	18.9%	34.4%	-0.155

The counts method ranks MGTA 495 Marketing Analytics highest and MGTA 444 Analytics Consulting lowest. The middle of the ranking is closer together, which is exactly where I would expect small sampling variation to change the order.
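
As a quick arithmetic check of the score definition, the top and bottom rows of the counts table can be reproduced from their raw counts. This is a minimal Python sketch; the function name `counts_score` is hypothetical, and the inputs are copied from the table above.

```python
def counts_score(shown, best, worst):
    """Best-minus-worst score as a difference of proportions."""
    return best / shown - worst / shown

# (shown, best, worst) for two classes, copied from the counts table.
marketing_analytics = counts_score(330, 199, 12)    # MGTA 495
analytics_consulting = counts_score(323, 61, 111)   # MGTA 444

print(round(marketing_analytics, 3))   # 0.567
print(round(analytics_consulting, 3))  # -0.155
```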

From MaxDiff Data to MNL Choices

For the MNL model, each task becomes two choice observations. First, the respondent chooses the best class from the four shown. If the shown set is \(\{a,b,c,d\}\), the probability of choosing class \(b\) as best is:

\[ P(\text{best}=b) = \frac{\exp(\beta_b)} {\exp(\beta_a)+\exp(\beta_b)+\exp(\beta_c)+\exp(\beta_d)} \]

Second, after the best class is removed, the respondent chooses the worst class from the remaining three. Since low-utility items are more likely to be chosen as worst, the utilities are flipped:

\[ P(\text{worst}=c \mid \text{best}=b) = \frac{\exp(-\beta_c)} {\exp(-\beta_a)+\exp(-\beta_c)+\exp(-\beta_d)} \]

Only relative utilities are identified. Adding the same constant to every \(\beta_j\) does not change any softmax probabilities. I set item 1, MGTA 403, to \(\beta_1 = 0\) and estimate the other nine utilities relative to that reference class.
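
The two choice probabilities and the identification argument can be illustrated with a small Python sketch. The utilities below are hypothetical values chosen for illustration, not estimates from the survey, and the function names are assumptions.

```python
import math

def best_prob(beta, shown, best):
    """P(best): softmax of utilities over the items shown."""
    denom = sum(math.exp(beta[j]) for j in shown)
    return math.exp(beta[best]) / denom

def worst_prob(beta, shown, best, worst):
    """P(worst | best): softmax of negated utilities over the remaining items."""
    remaining = [j for j in shown if j != best]
    denom = sum(math.exp(-beta[j]) for j in remaining)
    return math.exp(-beta[worst]) / denom

# Hypothetical utilities for four items labelled a-d (a is the reference at 0).
beta = {"a": 0.0, "b": 1.2, "c": -0.5, "d": 0.3}
shown = ["a", "b", "c", "d"]

p_best = best_prob(beta, shown, "b")
p_worst = worst_prob(beta, shown, "b", "c")

# Shifting every utility by the same constant leaves both probabilities unchanged,
# which is why one utility has to be pinned to zero.
shifted = {j: b + 5.0 for j, b in beta.items()}
same_best = math.isclose(p_best, best_prob(shifted, shown, "b"))
same_worst = math.isclose(p_worst, worst_prob(shifted, shown, "b", "c"))

print(f"P(best=b) = {p_best:.3f}, P(worst=c | best=b) = {p_worst:.3f}")
print(f"invariant to a constant shift: {same_best and same_worst}")
```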

MNL via Maximum Likelihood

The log-likelihood sums the log of the best-choice probability and the log of the worst-choice probability over all tasks:

\[ \ell(\boldsymbol{\beta}) = \sum_{\text{tasks}} \left[ \beta_{j_b^*} - \log \sum_{j \in \text{shown}} \exp(\beta_j) - \beta_{j_w^*} - \log \sum_{j \in \text{shown}\setminus\{j_b^*\}} \exp(-\beta_j) \right] \]
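
To make the expression concrete, here is a minimal Python sketch of the same log-likelihood. It is not the code used for the estimates in this report; the task tuples are hypothetical toy data, and `log_lik` is an assumed name.

```python
import math

def log_lik(beta, tasks):
    """Sequential best-worst MNL log-likelihood.

    beta  : dict item -> utility (the caller fixes the reference item at 0)
    tasks : list of (shown_items, best_item, worst_item)
    """
    total = 0.0
    for shown, best, worst in tasks:
        # Best choice: log-softmax of utilities over all shown items.
        total += beta[best] - math.log(sum(math.exp(beta[j]) for j in shown))
        # Worst choice: log-softmax of negated utilities, best item removed.
        rest = [j for j in shown if j != best]
        total += -beta[worst] - math.log(sum(math.exp(-beta[j]) for j in rest))
    return total

# Two hypothetical tasks over items 1-5 (not the survey data).
tasks = [([1, 2, 3, 4], 2, 3), ([2, 3, 4, 5], 5, 3)]

# With all utilities at 0, each task contributes log(1/4) + log(1/3).
zero = {j: 0.0 for j in range(1, 6)}
ll_zero = log_lik(zero, tasks)

# Utilities that agree with the observed choices give a higher log-likelihood.
aligned = {1: 0.0, 2: 1.0, 3: -1.0, 4: 0.0, 5: 1.0}
ll_aligned = log_lik(aligned, tasks)

print(ll_zero, ll_aligned)
```

An optimizer would search over the free utilities to maximize this function.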

I maximized this likelihood with optim() using BFGS and computed standard errors as the square roots of the diagonal of the inverse negative Hessian. To make the estimates easier to read, I also convert utilities into shares of preference:

\[ \hat{s}_j = \frac{\exp(\hat{\beta}_j)}{\sum_{k=1}^{10}\exp(\hat{\beta}_k)} \]

These shares answer a simple hypothetical question: if all 10 classes appeared on one screen, what share of choices would each class receive?

MNL Maximum Likelihood Estimates
Item	Class	MLE beta	SE	Preference share
9	MGTA 495 Marketing Analytics (Spring with Yavorsky)	1.630	0.146	28.4%
5	MGTA 453 Business Analytics (Fall with August)	0.744	0.142	11.7%
7	MGTA 455 Customer Analytics (Winter with Nijs)	0.656	0.142	10.7%
4	MGTA 452 Large Data (Fall with Hansen)	0.555	0.140	9.7%
8	MGTA 454 Capstone Project (Spring with Advisor)	0.443	0.140	8.7%
2	MGTA 464 SQL (Summer with Nijs)	0.438	0.137	8.6%
10	MGTA 457 Biz. Intelligence Systems (Fall with Schibler)	0.236	0.137	7.0%
1	MGTA 403 AI Math, Prog., & Analytics (Summer with Nijs)	0.000	Reference	5.6%
3	MGTA 451 Marketing/Finance/Operations (Summer with Wilbur/Buti/Shin)	-0.067	0.139	5.2%
6	MGTA 444 Analytics Consulting (Winter with Peterson)	-0.239	0.141	4.4%

The MLE model ranks MGTA 495 Marketing Analytics first and MGTA 444 Analytics Consulting last. Compared with counts, the broad story is similar, but the MNL uses more structure: it accounts for the specific competitors present in each task and treats not being selected as information.
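
The preference shares in the table follow directly from the reported utilities. As a check, this Python sketch applies the share formula to the MLE betas copied from the table; `preference_shares` is a hypothetical helper name.

```python
import math

# MLE utilities copied from the estimates table (item 1 is the reference).
mle_beta = {
    9: 1.630, 5: 0.744, 7: 0.656, 4: 0.555, 8: 0.443,
    2: 0.438, 10: 0.236, 1: 0.000, 3: -0.067, 6: -0.239,
}

def preference_shares(beta):
    """Softmax over all items: share of 'best' choices if all were shown together."""
    expd = {j: math.exp(b) for j, b in beta.items()}
    total = sum(expd.values())
    return {j: e / total for j, e in expd.items()}

shares = preference_shares(mle_beta)
for item, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"item {item:>2}: {share:.1%}")
```

The printed shares should match the table to the rounding shown, for example about 28.4% for item 9 and about 4.4% for item 6.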

MNL via Bayesian Estimation

The Bayesian model uses the same likelihood, but combines it with a weakly informative Normal prior:

\[ \boldsymbol{\beta} \sim N(\boldsymbol{0}, 10I) \]

I used a random-walk Metropolis-Hastings sampler. The proposal step size was tuned to 0.07, which produced an acceptance rate of 34.9%. I ran 20,000 iterations and discarded the first 5,000 as burn-in.
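
A random-walk Metropolis-Hastings sampler for this model can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the sampler that produced the results here: the data are simulated toy responses over four hypothetical items, the prior's 10 is treated as a variance, and the chain is far shorter than the 20,000 iterations reported. All function names are assumptions.

```python
import math
import random

def log_lik(beta, tasks):
    """Sequential best-worst MNL log-likelihood for a list of utilities."""
    total = 0.0
    for shown, best, worst in tasks:
        total += beta[best] - math.log(sum(math.exp(beta[j]) for j in shown))
        rest = [j for j in shown if j != best]
        total += -beta[worst] - math.log(sum(math.exp(-beta[j]) for j in rest))
    return total

def log_post(free, tasks, prior_var=10.0):
    """Log-posterior up to a constant; item 0 is the reference, fixed at 0."""
    beta = [0.0] + list(free)
    log_prior = -sum(b * b for b in free) / (2.0 * prior_var)  # assumes 10 is a variance
    return log_lik(beta, tasks) + log_prior

def simulate_task(true_beta, rng):
    """Draw one best-worst response from the model (toy data only)."""
    items = list(range(len(true_beta)))
    best = rng.choices(items, weights=[math.exp(true_beta[j]) for j in items])[0]
    rest = [j for j in items if j != best]
    worst = rng.choices(rest, weights=[math.exp(-true_beta[j]) for j in rest])[0]
    return items, best, worst

def metropolis(tasks, n_free, n_iter, step, rng):
    """Random-walk Metropolis-Hastings with a symmetric Normal proposal."""
    current = [0.0] * n_free
    current_lp = log_post(current, tasks)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in current]
        proposal_lp = log_post(proposal, tasks)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws.append(list(current))
    return draws, accepted / n_iter

rng = random.Random(42)
true_beta = [0.0, 1.0, 0.5, -1.0]   # hypothetical utilities for 4 toy items
toy_tasks = [simulate_task(true_beta, rng) for _ in range(100)]

draws, acc_rate = metropolis(toy_tasks, n_free=3, n_iter=3000, step=0.07, rng=rng)
kept = draws[750:]                   # drop the first quarter as burn-in
post_mean = [sum(d[k] for d in kept) / len(kept) for k in range(3)]
print(f"acceptance rate {acc_rate:.1%}, posterior means {post_mean}")
```

The step size controls the trade-off: larger steps lower the acceptance rate but explore faster, which is why it is tuned rather than fixed in advance.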

Bayesian MNL Estimates
Item	Class	Posterior mean beta	95% credible interval	Posterior mean share
9	MGTA 495 Marketing Analytics (Spring with Yavorsky)	1.617	[1.323, 1.917]	28.3%
5	MGTA 453 Business Analytics (Fall with August)	0.731	[0.470, 1.018]	11.7%
7	MGTA 455 Customer Analytics (Winter with Nijs)	0.647	[0.386, 0.916]	10.8%
4	MGTA 452 Large Data (Fall with Hansen)	0.543	[0.265, 0.812]	9.7%
2	MGTA 464 SQL (Summer with Nijs)	0.431	[0.167, 0.705]	8.7%
8	MGTA 454 Capstone Project (Spring with Advisor)	0.429	[0.159, 0.710]	8.6%
10	MGTA 457 Biz. Intelligence Systems (Fall with Schibler)	0.216	[-0.034, 0.463]	7.0%
1	MGTA 403 AI Math, Prog., & Analytics (Summer with Nijs)	0.000	Reference	5.6%
3	MGTA 451 Marketing/Finance/Operations (Summer with Wilbur/Buti/Shin)	-0.075	[-0.341, 0.224]	5.2%
6	MGTA 444 Analytics Consulting (Winter with Peterson)	-0.254	[-0.510, 0.022]	4.4%

The Bayesian estimates are very close to the MLE estimates, which is reassuring because the prior is weak and the dataset is fairly large. The main advantage is interpretability of uncertainty: once I have posterior draws, credible intervals for shares are easy to calculate directly.
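
The point about derived quantities can be illustrated in Python: transform each posterior draw into shares, then summarize the shares directly. The draws below are a synthetic stand-in generated from Normal distributions for three hypothetical items. They are not the fitted posterior, and the helper names are assumptions.

```python
import math
import random

def shares_from_beta(beta):
    """Softmax of one utility vector."""
    expd = [math.exp(b) for b in beta]
    total = sum(expd)
    return [e / total for e in expd]

def quantile(sorted_vals, q):
    """Nearest-rank style quantile on a pre-sorted list (0 <= q <= 1)."""
    idx = int(round(q * (len(sorted_vals) - 1)))
    return sorted_vals[idx]

# Synthetic stand-in for posterior draws of (beta_ref, beta_top, beta_bottom).
# The reference utility is fixed at 0 in every draw.
rng = random.Random(1)
draws = [[0.0, rng.gauss(1.6, 0.15), rng.gauss(-0.25, 0.14)] for _ in range(5000)]

# Apply the share transformation draw by draw, then summarize each item.
share_draws = [shares_from_beta(d) for d in draws]

summary = []
for k in range(3):
    col = sorted(s[k] for s in share_draws)
    mean = sum(col) / len(col)
    lo, hi = quantile(col, 0.025), quantile(col, 0.975)
    summary.append((mean, lo, hi))
    print(f"item {k}: mean share {mean:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

Because the transformation is applied per draw, the interval reflects the joint uncertainty in all utilities without any delta-method approximation.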

Comparing the Three Methods

Side-by-Side Ranking Comparison
Item	Class	Counts score	Counts rank	MLE share	MLE rank	Bayes share	Bayes rank
9	MGTA 495 Marketing Analytics (Spring with Yavorsky)	0.567	1	28.4%	1	28.3%	1
5	MGTA 453 Business Analytics (Fall with August)	0.269	2	11.7%	2	11.7%	2
7	MGTA 455 Customer Analytics (Winter with Nijs)	0.251	3	10.7%	3	10.8%	3
4	MGTA 452 Large Data (Fall with Hansen)	0.183	5	9.7%	4	9.7%	4
8	MGTA 454 Capstone Project (Spring with Advisor)	0.167	6	8.7%	5	8.6%	6
2	MGTA 464 SQL (Summer with Nijs)	0.162	7	8.6%	6	8.7%	5
10	MGTA 457 Biz. Intelligence Systems (Fall with Schibler)	0.057	8	7.0%	7	7.0%	7
1	MGTA 403 AI Math, Prog., & Analytics (Summer with Nijs)	-0.042	9	5.6%	8	5.6%	8
3	MGTA 451 Marketing/Finance/Operations (Summer with Wilbur/Buti/Shin)	-0.080	10	5.2%	9	5.2%	9
6	MGTA 444 Analytics Consulting (Winter with Peterson)	-0.155	11	4.4%	10	4.4%	10
11	NA	0.227	4	NA	NA	NA	NA

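The three methods agree most strongly at the extremes. Among the ten classes, MGTA 495 Marketing Analytics ranks first and MGTA 444 Analytics Consulting ranks last under all three methods. The only disagreement is in the middle: MGTA 454 Capstone Project sits just above MGTA 464 SQL in the counts and MLE rankings, and the two swap places in the Bayesian ranking. Their scores are nearly identical under every method, so a small difference is enough to flip the order. The counts ranks in the table run to 11 because the NA row takes rank 4.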
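
The rank comparison can be checked mechanically. This Python sketch ranks the ten classes under each method using the scores and utilities copied from the three results tables, then lists the items whose rank differs across methods. The NA row is omitted, so counts ranks here run from 1 to 10 rather than 1 to 11; the helper name `ranks` is hypothetical.

```python
# Values copied from the three results tables (ten classes only; NA row omitted).
counts = {9: 0.567, 5: 0.269, 7: 0.251, 4: 0.183, 8: 0.167,
          2: 0.162, 10: 0.057, 1: -0.042, 3: -0.080, 6: -0.155}
mle = {9: 1.630, 5: 0.744, 7: 0.656, 4: 0.555, 8: 0.443,
       2: 0.438, 10: 0.236, 1: 0.000, 3: -0.067, 6: -0.239}
bayes = {9: 1.617, 5: 0.731, 7: 0.647, 4: 0.543, 2: 0.431,
         8: 0.429, 10: 0.216, 1: 0.000, 3: -0.075, 6: -0.254}

def ranks(scores):
    """Map item -> rank, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: pos + 1 for pos, item in enumerate(ordered)}

r_counts, r_mle, r_bayes = ranks(counts), ranks(mle), ranks(bayes)

# Items whose rank is not identical across all three methods.
disagree = sorted(
    item for item in counts
    if len({r_counts[item], r_mle[item], r_bayes[item]}) > 1
)
print(disagree)  # [2, 8]
```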

Counts are useful for a quick dashboard because the score is transparent. MLE adds a formal choice model and produces standard errors for the estimated utilities. Bayes adds an even more flexible uncertainty workflow, especially for derived quantities like preference shares.

Discussion

Substantively, the survey suggests that students are most enthusiastic about the applied analytics classes (Marketing Analytics, Business Analytics, and Customer Analytics) and least enthusiastic about Analytics Consulting and the Marketing/Finance/Operations class. I would be careful not to overstate the exact middle ranking, because the credible intervals overlap for several classes.

For a non-technical audience, I would show the counts result first because it is intuitive: how often did each class win versus lose when shown? For a formal report, I would use the MLE MNL estimates because they match the structure of the MaxDiff task. For planning simulations or future model extensions, I would prefer the Bayesian version because posterior draws make uncertainty on any quantity easier to summarize.

The biggest caveat is external validity. These results come from a self-selected sample of MSBA students, and preferences may depend on students’ prior work experience, current quarter, instructors, and how the course names were presented. If I had more time, I would estimate individual-level preferences or segment students by background to see whether the overall ranking hides different taste groups.