Revisiting Past Decisions: I’m Going to Try Watching Football Again

Over a year ago, I decided to stop watching football. To briefly summarize that thought process, I used to be a huge football fan, but after I started reflecting on the importance of going against one’s culture/interests to do what was “right” (e.g. the idea of a conservative supporting some basic gun safety laws) I decided I should walk the walk.

Football causes intense brain trauma to players, and that affects over a million athletes across high school, college, and professional football. The summer before I stopped watching, a study came out showing that of the 111 brains of former NFL players examined, 110 had CTE (99%); the same study also found CTE in 88% of the college players’ brains examined. Then factor in that poor kids are more likely to play football as their lottery ticket out, plus the link between race and socioeconomic status, and the whole thing seems wrong.

So why would I return to watching football? Has any of that changed?

No, all of that is still true. But to me, that hasn’t been the whole universe of considerations. Really what it now comes down to for me is that football is a social connection. I’ve never been to Seattle or Jacksonville, but if I meet someone from there, talking about football is a good icebreaker. 

Is The Social Benefit Really Worth It?

I think so, yes. One major concern of mine is how so many people live in their bubbles and don’t have to confront good faith disagreements from other people. It makes it easy to strawman any disagreements, and it results in a political environment of yelling at each other while the wealthy and corporations get to consume all of the economic growth from the last 40 years. Geography (i.e. urban vs rural divides) and class already make this hard, and technology has only accelerated it. I want to try to interact with more “real” people (i.e. people that don’t all think like MIT PhD students), and I think that gets non-trivially hindered by my long list of things that I gave up. Navigating those spaces is already uncomfortable and hard, and common ground can be a very helpful starting point. So maybe it’s time for me to reassess.

Let’s say you and I are at dinner and you order a pitcher of beer for us. That’s when I awkwardly tell you that I don’t drink. We get our menus, and I decline splitting a pepperoni pizza because I don’t eat meat. I guess you probably feel like you shouldn’t tell that story about the amazing T-Bone that you ate last month. But it’s no problem! You pivot the conversation to the latest Antonio Brown drama. Sorry, once again, because I haven’t been watching football.

In reality, I actually try to superficially talk football when meeting new people and they bring it up (but I have now missed two offseasons’ worth of players moving around). But the basic idea still stands: people don’t want to feel judged. Everyone just wants to enjoy their thing without feeling like someone thinks they’re a bad person. Is every problem in the world worth causing those tiny social wrinkles in a large number of my social interactions?

What was the last thing you changed your mind about? I don’t know about you, but for me it’s not often from some “other” convincing me I was wrong & dumb & needed to adopt their solution. No, instead it has usually been someone I could relate to telling me “Yeah, I struggle with _____ and haven’t figured out what to do yet.” That’s what actually gets me thinking about it & reflecting on what I should do. I might not end up exactly where they do, but this give and take of “the right thing” needs trust and social connection in order for us to find out how to be our best selves.

Well Then Why Is It Important for Football But Not Vegetarianism?

Essentially, it boils down to my current belief that I can’t pull off all these hard lines at once. I don’t drink for personal and health reasons, but I gave up football & meat because they seemed like “the right thing to do.” Now I think I’m at a point where one of them needs to give. And if I have to pick which one stays, I pick vegetarianism. I think the scale of suffering for animals (who don’t have any say in the matter) is worse than what football does to poor kids (who don’t have very good choices but have more autonomy than the animals do). PETA estimates that in the US we kill 9 billion chickens per year; given that 70-85 million people died in World War II, that’s a death toll of over 100 WW2s each year. For chickens alone. And the way factory farms treat these animals would be criminal if done to other animals.

What Does Not Watching Football Do Anyway?

Like many things in life (such as recycling more or going vegetarian), it’s not so much about the literal individual impact. It’s more about solidarity and signalling that things could be better if everyone chose to do <insert_good_thing_here>. To me, it’s showing that you think some cause is important enough to change your own habits over, rather than just insisting that other people have to do all of the changing.

To be honest, in March 2018 I was looking for something I could do to tell myself that I was doing “more”, and sacrificing football was that thing for me. And at the time, it was probably the right thing for me as I started my journey of trying things and seeing what felt right and what seemed to help others. It was a good process of grabbing on to what I could in areas like activism, protests, governance, research, and more. There’d be no way for me to get all of the things 100% right on the first try. But it’s been a useful journey.

In December 2018, I put my convictions to the test. I was offered ~$3,000 to consult on a sports analytics high school project. Should I turn down the offer because I swore off football or should I take the job and donate the money to a good cause like fighting malaria in developing countries or promoting diversity in STEM? I decided that my participation in the project didn’t hurt anyone & I agreed to join. Now I’m using the money to help start a local chapter of Girls Who Code at my old high school. And depending on how much that costs, I could still have a lot left over to donate to a good cause that would otherwise not have gotten that money. This exercise forced me to think about putting a dollar amount to the choices I made & try to assess whether it was worth it. In that case, it wasn’t. And in the case of social costs/benefits, I again don’t think it’s worth it.

But What If I’m Wrong? What If Football Really Is As Bad Or Worse?

That could be! As time has gone on, I’ve only gotten more confused & uncertain about what is or isn’t “right”. Some things seem pretty obviously correct, such as fairness, dignity, and education. But a lot of things are hard. I’m viewing this decision as a course correction from maybe going too far in one direction last year, but either way I should probably reassess in a year or two and see if I feel any differently.

I feel like I’m at a point where I’m on the fence about football & could end up on either side at the end of the day. I think I’m probably more uncomfortable with my Amazon Prime membership because of how much power it gives Amazon, how it eliminates competitors that can’t compete, how much control they have over us, and how relentless the company has been about efficiency.

I really don’t like the pressure I feel like I’ve put on myself to be someone who “doesn’t watch football for moral reasons” (not that anyone actually cares what I do, but we always feel like society is closely watching & scrutinizing us). And besides, there’s a spectrum of football participation; it’s not a binary. I can watch some games without buying merchandise or attending games. In fact, it’s actually pretty hard to try to 100% avoid following the sport: what do you do when it’s on at a bar or when your newsfeed is full of football posts from friends? For now, I’m going to start with the baby steps.

TA Reflection: S19 ML for Health

In Spring 2019, I (along with Irene Chen) TA’d MIT’s first Machine Learning for Health class (see the MIT News article). You can visit the course website here to see the schedule/resources.

This was a very intensive class, requiring:

  • Reading responses before lectures, 1-2 times per week
  • Scribing a lecture or staffing one of the nights for the Community Consulting event
  • 6 psets
    • Mortality Prediction on ICU Data
    • Diabetes Onset Prediction on Claims Data
    • Clinical NLP
    • Physiological Time Series
    • Causal Inference Intro / Theory
    • Causal Inference for Opioids on Claims Data
  • A class research project

I’m very proud of how everything went! Great course staff! Great students! Great invited speakers!

What am I most proud of?
The psets. I spent the largest amount of my time making the psets clear and interesting. There were some pretty stressful points as deadlines approached & IT infrastructure issues needed addressing. But I’m very happy with how the assignments came out!

What did I learn?
I remember during pset 1, there was a Piazza question about whether the students needed to normalize their features before fitting a model. I replied very quickly, telling the students they didn’t have to worry about that. About 1-2 hours later, another student (not noticing my official answer) replied to the OP saying they probably should normalize features because of the feature analysis question. That student was right. But because I’d already posted an official answer & some people had already finished/submitted the assignment on that assumption, I couldn’t reverse course.

After that, I realized that even though I might feel a need to swoop in and help ASAP, I should move a little more cautiously & make sure I understand the implications of whatever decisions I make (because people will make their decisions based on those answers).
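For context, here’s a toy sketch of why that student was right: without normalization, the magnitude of a linear model’s coefficients mostly reflects each feature’s units rather than its importance, which breaks any feature-analysis question. All the feature names and numbers below are made up for illustration, not from the actual pset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ICU-style features on very different scales:
# heart rate (~60-100 bpm) vs. a 0/1 ventilation indicator.
n = 1000
heart_rate = rng.normal(80, 10, n)
on_ventilator = rng.integers(0, 2, n).astype(float)

# Outcome depends equally (in standardized terms) on both features.
y = 0.5 * (heart_rate - 80) / 10 + 0.5 * on_ventilator + rng.normal(0, 0.1, n)

def fit_coefs(X, y):
    # Ordinary least squares with an intercept column.
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta[1:]  # drop the intercept

raw = fit_coefs(np.column_stack([heart_rate, on_ventilator]), y)

# Standardize each feature to zero mean, unit variance before fitting.
Z = np.column_stack([heart_rate, on_ventilator])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
std = fit_coefs(Z, y)

print("raw coefficients:         ", raw)  # heart-rate coef looks tiny
print("standardized coefficients:", std)  # magnitudes become comparable
```

On the raw features the heart-rate coefficient looks roughly 10x smaller than the ventilation coefficient purely because heart rate spans tens of units; after standardizing, the two are on a comparable scale.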

What do I wish we’d done better at?
As a class, we had 11 guest speakers, though only 4 were women (36%). I’m very happy with how our guest lectures went, but I suspect we could’ve had just as excellent discussions with 50% (or higher) female speakers.

How well did I do?
Student feedback for me was mostly positive (though I was the lower-scoring TA on the evaluations).

The written responses said my strengths were patience, kindness, responsiveness, and interest in the material. My experimental recitations were not universally well-received, which is good feedback to get.

Student responses of me overall. There were only 14 responses because only students enrolled in 6.S897 could fill out the Course 6 evaluation; everyone enrolled as HST.956 could not write an evaluation.

What would I do differently a second time?
Multiple students expressed interest in a pset about Computer Vision for Healthcare (e.g. Chest X-Rays). This is an interesting idea, and I’d be open to exploring it. Of course, it would have some challenges w.r.t. processing power available to all of the students.

It might also be fun if there was a pset where each student needed to interview a doctor to learn about what they do & brainstorm about how ML would or wouldn’t help with various tasks.

In addition, I still think my recitations (e.g. observational data and challenges in evaluation) could be different from the lecture material but still interesting. I was hoping to use my recitations to discuss the broader context of healthcare (including regulations, deploying systems, and health policy). I suspect that there is still a way to integrate those lessons into the supplementary (optional) recitations to bring a different perspective for ML engineers in the class. Maybe I just needed to find a better set of areas to explore and/or ways to discuss that information. I tried assigning 3 podcasts/videos to listen to & having a group discussion, but no one listened to them!  🤣

Research: Racial Disparities in EOL

Just got back from a conference, and it’s always so interesting to experience the difference between reading a paper vs talking to its author & getting a more contextualized picture of how thoughtful or thorough the research is. As I thought about that more, I decided that it might be useful (or at least fun) if I wrote about some of my research & explained some of the context and findings. I assume it’d be more fun to read this than the papers themselves.

For my first research post, I wanted to start with the work I’m most proud of: my Master’s Thesis last year, which looked at the possible links between end-of-life treatment decisions and trust in the doctor-patient relationship.

Link to slides from my Research Qualifying Exam presentation.

In 2016, one of our clinical collaborators (Leo) pointed Marzyeh and me to a set of papers [1,2,3] which studied racial disparities in end-of-life (EOL) care in North America. They looked at “aggressive” care (i.e. high-risk interventions that are unpleasant, like a tube in the throat to try to prolong life) vs comfort-based care (i.e. hospice). The main finding was that white patients received less aggressive care than nonwhite (in particular African American and Hispanic) patients.

Renowned author and surgeon Atul Gawande has extensively studied end-of-life decisions and how patients who are empowered to make informed EOL decisions overwhelmingly choose to live with dignity instead of overly medicalized procedures (and that they do so with much higher levels of well-being and sometimes even longer lives). But these studies from Leo found that white patients seemed to be transitioning to hospice care earlier and at higher rates. Why?

Of course the first thing I thought of was implicit bias from the caregiver leading them to unknowingly make different care decisions. And while I don’t want to minimize that because it relates to the whole doctor-patient relationship, some of the researchers speculated an even sharper hypothesis for what could be causing this: mistrustful patients are ignoring their doctor’s recommendation for hospice.

For anyone unfamiliar with end-of-life care and decisions, it’s hard to appreciate the gravity of the situation without context. There are serious dignity questions at stake about how you or a loved one would like to go, and it’s especially hard to make that decision for someone else or when you’re really scared & confused. West Virginia Public Broadcasting produced a one-hour documentary about what the struggles and process of end-of-life care look like. All I can do is write words on a page and tell you what other people say, but this video lets you actually hear from patients and family members experiencing this in their own words. If you want to actually appreciate what people go through, you need to hear from them directly.

Imagine your father is in the hospital. His doctor approaches you and says that after a week of critical care, she believes they’ve done everything they can for your dad. She suggests that it might be time to consider a transition to withdrawing treatments and making him comfortable for what little time he has left.

If you trust your doctor — if you believe that she’s looking out for your family’s best interests — you might strongly consider her advice. But if you don’t trust her — if you think the healthcare system doesn’t want to waste the resources on you or that they want your father’s bed for someone else — then you might say “Keep fighting and do what you can to save my dad.” This scenario hasn’t been studied thoroughly enough to say that it’s definitively what’s creating the existing racial disparity, but it is the causal mechanism we believed could be driving it. We wanted to study that further.

Previous work has suggested that racial disparities in health outcomes may reflect higher levels of mistrust for the healthcare system among black patients. Family members of African American patients are more likely to cite absent or problematic communication with physicians about EOL care. When the doctor-patient relationship lacks trust, patients may believe that limiting any intensive treatment is unjustly motivated, and demand higher levels of aggressive care. To better understand what could be causing this lowered trust and diminished communication, I read Harriet Washington’s 2007 book Medical Apartheid, and it was the most informative and contextualizing work that I read throughout this entire multi-year project.

American Medicine’s Dark History of Exploiting Black Bodies

In her 2007 book, Washington suggests that the medical exploitation of African Americans by white institutions throughout American history has created “Black Iatrophobia.” The most infamous example of medical exploitation of black Americans was the Tuskegee Syphilis Experiment from 1932-1972, where black men were tricked into not receiving any medicine for syphilis (despite the availability of penicillin) so that scientists could study the progression of the disease. But while that might still be the most notorious example, it is far from the only one.

Medical abuse in America has plagued the African American community from the beginning of US history all the way through modern times. Going back to 1801, Thomas Jefferson injected 80 of his own slaves with smallpox to prototype vaccines. In the late 1840s, Dr. James Marion Sims (considered by some to be “the Father of Gynecology”) surgically experimented on and mutilated his female slaves — who were unable to refuse his operations — without anesthesia. Until the early twentieth century, medical schools were using bodies graverobbed from black cemeteries as cadavers for their dissection trainings.

As recently as 1987-1991, US scientists administered as much as five hundred times the approved dosage of the experimental Edmonton-Zagreb vaccine against measles to African American and Hispanic babies in Los Angeles without communicating to the parents on informed consent forms that the vaccine was experimental or unlicensed.

Harriet Washington’s thesis of Black Iatrophobia has manifested in the published literature. Socialized mistrust of the medical community among minority groups has been established as a factor in care differences. With this broader understanding from the book, we wanted to further study this notion of “mistrust” and how it related to EOL care. The first thing we needed to do was try to establish what mistrust was and how to quantify it for the patients.

Modeling Mistrust Algorithmically

The first thing we did was replicate that the racial disparity existed in our two ICU datasets, which we found largely to be the case (especially for mechanical ventilation, though less extreme for vasopressors). But even after controlling for severity of illness, the differences in treatment across race still existed.

The next thing we wanted to do was to split the group into trustful and mistrustful to see whether that gap was even larger, but we did not have scores for each patient indicating the quality of trust in the doctor-patient relationship. We did, however, have some clues about trust for some patients.

An example of a patient’s clinical note documenting their explicit mistrust of their doctors and how that manifested in frustration and noncompliance.

Clinical notes written by doctors and nurses provide a very vivid and comprehensive view into the interaction between the patient and their caregivers. In the above example, we can see a patient who was very frustrated and mistrusting. The relationship with their providers was clearly poor, but not every patient’s notes are as easy to discern as this one’s.

An illustration of how we used Machine Learning to create mistrust scores for all patients by learning patterns about the few hundred “labeled” examples and extrapolating to all patients.

In order to derive a “mistrust score” for every single patient, we used the “clear” mistrusting notes as anchors and tried to determine how similar other patients were to those cases using a simple supervised Machine Learning algorithm (technically speaking, we derived three different mistrust scores through a process like this in order to avoid over-committing to any one definition of what “trust” is or how it definitively manifests).

Figure showing the mimiciii.chartevents table from the MIMIC database. This table includes many events, including all of these indicators above, documented by the caregiver staff.

The “inputs” for this model were a comprehensive set of documented interactions in the chart events table from the EHR. It captures a very large number of interpersonal interactions in the doctor-patient relationship, such as whether the patient’s pain is being managed well, whether the patient is restrained and treated as a threat, whether the patient got their hair washed, how agitated the patient appears, how frequently the care team is communicating with the family, and much more.

We looked at the most predictive features for the mistrust scores, and saw that high levels of mistrust were associated with agitation, pain, and patients being restrained. On the other hand, the most trustful patients tended to enjoy little-to-no pain, low agitation, higher levels of healthcare literacy, and more consistent family communication. 

Correlation matrix between severity scores and mistrust scores. A 1.0 means perfect agreement (hence 1s on the diagonal comparing a given score to itself).

After fitting these models, we wanted to sanity check them to ensure that the algorithmic scores we are calling “mistrust” align with our intuitive notions of what that should mean. In particular, we wanted to ensure that the learned scores were not simply capturing some existing trend such as severity of illness: we saw that the mistrust scores had moderate correlation with one another (r=0.26), the severity of illness scores had strong correlation with one another (r=0.68), yet the mistrust scores had virtually no correlation with severity of illness (r<=0.05). While this experiment doesn’t prove that the scores are capturing “trust,” it does show that whatever they’re capturing is decidedly not just how sick the patients are.
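This kind of sanity check is easy to run on any set of scores. Here is a toy version with synthetic numbers chosen to mimic the pattern described above; the r values quoted in the post come from the real data, not from this simulation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

# Two severity scores that share a common "illness" signal, and two
# mistrust scores that share a weaker common "trust" signal. Neither
# family shares any signal with the other.
illness = rng.normal(size=n)
severity_a = illness + 0.6 * rng.normal(size=n)
severity_b = illness + 0.6 * rng.normal(size=n)

trust_signal = rng.normal(size=n)
mistrust_a = trust_signal + 1.6 * rng.normal(size=n)
mistrust_b = trust_signal + 1.6 * rng.normal(size=n)

scores = np.column_stack([severity_a, severity_b, mistrust_a, mistrust_b])
corr = np.corrcoef(scores, rowvar=False)

# Expected pattern: severity scores agree strongly with each other,
# mistrust scores agree moderately, and the cross-correlations are ~0.
print(np.round(corr, 2))
```

If the mistrust scores had instead shown strong correlation with severity, that would have been a red flag that they were just re-deriving how sick the patients are.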

These plots are cumulative distributions of the percentage of patients given some duration of treatment, with the dotted vertical lines indicating the median of the population.

To understand these plots, let’s look at an example. In the top left quadrant, we see that the median black patient received 3,286 minutes of mechanical ventilation whereas the median white patient received only 2,454 minutes. The difference is about 14 hours at the median. But when we split our population into trustful and mistrustful (instead of white and black), we see that treatment disparities are much larger across trust-based cohorts than race-based cohorts. This trend held for 3/3 metrics for mechanical ventilation and 2/3 metrics for vasopressors.

Of course anyone who has experience in EOL care would have already been able to tell you how essential trust, respect, and communication are. The main concern for these racial disparities is that levels of mistrust are higher in one population than in another.

CDF of the 3 mistrust scores for white patients (blue) and black patients (orange). The scores were scaled to be zero mean and unit variance at the full cohort level.

We found that, as anticipated, the majority of metrics indicated that the black population of patients had a statistically significantly higher level of mistrust. This (in conjunction with the above finding that mistrustful patients receive longer durations of care) reinforces our belief that mistrust could be the mechanism causing racial disparities in EOL care. Although these metrics are likely not perfect, they show as a starting point that there is a strong signal which needs to be better understood and potentially addressed. There are many ways to improve upon this work, and I think that further study is warranted.
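For readers curious how a distributional shift like this can be tested, here is a minimal permutation-test sketch on synthetic z-scored mistrust scores. The cohort sizes and the size of the shift are made up for illustration; the paper’s statistics were computed on the real data.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-ins for z-scored mistrust scores in two cohorts,
# with a modest upward shift in the second cohort.
cohort_a = rng.normal(0.00, 1.0, 3000)
cohort_b = rng.normal(0.25, 1.0, 600)

observed = cohort_b.mean() - cohort_a.mean()

# Permutation test: shuffle cohort labels and count how often a shift
# at least this large appears by chance.
pooled = np.concatenate([cohort_a, cohort_b])
count = 0
for _ in range(2000):
    rng.shuffle(pooled)
    diff = pooled[:600].mean() - pooled[600:].mean()
    if diff >= observed:
        count += 1
p_value = count / 2000

print(f"observed shift: {observed:.3f}, permutation p-value: {p_value:.4f}")
```

A small p-value here says the observed shift between cohorts is very unlikely under random relabeling, which is the flavor of evidence behind the “statistically significantly higher level of mistrust” claim.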

Future Work

There are lots of potential follow-up questions one could look at, some of which are technical, some of which are social/ethical/etc, and some of which are both. I’m happy to chat about ideas/reactions that people have!

  • Patient surveys. Don’t speculate how the patients feel. Ask them!
    • Everything in this analysis was through the eyes of what the caregivers recorded in the notes / chart events.
    • Work with an interdisciplinary team (sociologists, caregivers, patients, etc) for better definitions of trust.
  • Doctor Biases. Do some doctors have larger racial disparities than others?
    • Do some doctors have higher levels of mistrust than others? Do some doctors have high mistrust among black patients but not white patients?
  • Causal Inference. Was mistrust the cause of different treatment patterns?
  • Cultural: This analysis only focused on white and black patients.
    • Different cultures handle death & dying differently. We have an autopsy-based metric, but some people wouldn’t want the body cut open at all.
  • Policy: What should we do if we believe the status quo hurts nonwhite populations? What should change?
    • One could mandate standardized care. But people choose aggressive care. Is it too paternalistic to tell them otherwise?
    • Is the problem that the choices are not truly informed & that people don’t necessarily know what they’re signing up for? If so, how do we better help people make whatever choice they want in a truly informed way?

Overcoming Perfectionism

I started this blog in January as a vehicle to encourage me to sit down, wrestle with my thoughts on an issue, and synthesize many voices into an interesting narrative. I only got two posts done before the difficulties of TA’ing and research piled up & I abandoned my “take Sundays off” routine.

I was too uncomfortable with the idea of posting something “half assed”. I wanted all of my posts to be clever, insightful, and well-researched. If they couldn’t be that, then I didn’t want them to exist. And that’s exactly what I got.

Personally, I think that’s a real shame! I’ve grown so much in the last 8 months, and I wish I could’ve shared that in a more explicit way. My new resolution is to try to stop being such a perfectionist. I hope that I can overcome my fears of unfinished work so that I can share the thoughts I have while they’re still developing and get more input on how to understand certain issues.

I hope that in the future, I will write more posts that are shorter & less polished. I guess we’ll see!

Dracut DI is Back!

Today we had our Regional tournament for Dracut DI. Teams in the Merrimack Valley all gathered at Chelmsford High School to present their solutions to their various Destination ImagiNation challenges. Dracut had 3 teams:
– 2 Fine Arts teams and
– 1 Service Learning team.

All three of our teams were returning from last year (which was DI’s first year back in Dracut in years). And with a few new faces added to each team, every single one of them did even better than they did in 2018. It was so rewarding to see how hard everyone worked!

Tyler and Anne accepting the 2nd place ribbons on behalf of their team at Closing Ceremonies.

In an incredibly exciting end to the night, our Service Learning team (managed by Anne) placed 2nd at the tournament! This was Dracut’s first time on the stage at DI’s awards ceremony since the program returned. The last time Dracut placed at any DI tournament was 2013, so this is a really proud moment! We are so proud of James, Tyler, Marissa, and Julianne for their hard work!!

Sad soup cans at the Dracut Food Pantry hoping for volunteers to turn them into delicious food.

Anne’s team participated in the Service Learning challenge: they chose a community service project and then created/performed a skit to tell their project’s story. The team chose to work on assembling a database of volunteer opportunities in Dracut for other kids in their school to volunteer. By building this “volunteering database” and running a volunteer sign-up booth at recess, they were able to build a project that amplified their impact beyond what they could do themselves! In their skit, they showed the perspective of some lonely soup cans at the Dracut Food Pantry. The performance ended as student volunteers arrived to help transform the excited cans into food to feed the hungry. It was very wholesome!

Our two other teams worked on the Fine Arts challenge. They needed to choose a game to research, and then present a skit which has a cool game gizmo and multiple points of view.

Two points of view: those sucked inside the board game (left) and those playing (right).

Andrea’s team (Evan, Nolan, Serena, Julia, Logan, and Matthew) chose Monopoly. In their skit, Julia and Logan got sucked into the board. They tried to return home as they explored the world of giant dice, giant chance cards, and a not-so-friendly Mr. Monopoly. The team worked really hard on their technical element: headlights (made out of water bottles) on Mr. Monopoly’s car. In the end, Julia sprung Logan out of jail and the two of them got back to their friends! Next time, they’ll probably just stick to Mario Party.

The pup-icorns join forces with the cat-icorns to save the kidnapped princess.

The second Fine Arts team from Dracut was Charlotte, Yulianna, Gretchen, Rosalie, Elise, and Vivienne (managed by Nicole). The team worked so hard on their props and scenery, and it really showed! They set up their many backdrops and presented their Webkinz-inspired story. The princess was kidnapped and tied to a tree, so the pup-icorns teamed up with the cat-icorns to work together to save the princess. I managed this team last year back when it was just Vivienne and Rosalie, and I was blown away by how far they’ve come with a year of experience and a few friends (as well as an incredible mom / team manager in Nicole). They did so well! And even more importantly, they had fun!

Vivienne, Rosalie, and me from our 2017-2018 team that I managed.

I am so proud of how hard everyone worked to make this happen. Thank you to everyone who donated so that we could afford to register for the tournament. Thank you to Dracut Public Schools (especially Principal Kimble from the Campbell) for being so supportive at every step of this process. Thank you to Alyssa McCallion for designing our awesome t-shirts for two years in a row now, and to Ann Morin for printing them so quickly! An ENORMOUS thank you to the team managers, who honestly had the hardest job of all. And also a thank you to my fellow Dracut DI Coordinator, Maggie Regan! I am so happy with how this year went, and I’m even more excited for DI 2020!

Dracut DI 2019

Why Should I Care About What “They” Have to Say?

This post discusses spoilers for:
Christmas Eve, 1914 by Charles Olivier
– Black Mirror S3E5 “Men Against Fire”

Our Shared Humanity

Last week, I noticed a deal for Audible members. It was offering a free download of the short story Christmas Eve, 1914. I decided to download it and check it out. It takes place in the trenches of World War 1, and after listening to it I was so inspired.

A link to a behind-the-scenes look on the audiobook Christmas Eve 1914.

In the story, the British officers brace themselves as they receive orders from their commander to set up a machine gun to fend off a German attack on Christmas Day. Tensions run high as they take their positions, prepared to kill or be killed by German “men — or what used to be men — running at you and firing.” When four German soldiers enter No Man’s Land and slowly start heading toward their camp, the captain waffles under pressure, hesitating for a few seconds before giving the order to fire.

Then suddenly, one of the soldiers starts singing Silent Night and the Germans (who were holding little Christmas trees) join in. The German officers had hoped to negotiate a one hour ceasefire for the opportunity to collect and bury the hundreds of dead bodies that had been lying on the battleground for weeks, covered in mud and blood. After the British captain breaks down in tears over the men (and boys) he’d sent to die, the German captain suggests, “Gentlemen, maybe war takes a holiday today.” The British and German soldiers begin to talk and see one another as young, scared boys just like themselves. They share chocolate and cigarettes, and after realizing one of the German cooks used to live in England before the war, they start up a friendly game of soccer. The story ends with the British captain’s reflection about the empathy and bravery he learned that day. The story really did make me tear up. The British were seconds from gunning down those German soldiers out of fear for their own lives, but because one of them saw the nakedness and vulnerability of what the Germans were doing, their better selves were able to win out for the day.

But there’s an unspoken sad part of this story. Although it was a truly beautiful moment to see the war pause for the day, the reality is that these are soldiers fighting in a war that they can’t end themselves. The next day, they will have to return to killing each other, and no amount of empathy or understanding can change that. And it’s likely that the generals would’ve been incredibly mad to hear that their men were playing soccer with the enemy on the battlefield. To the generals, the war is a means to an end; they need to defeat the bad guys in order to preserve their own way of life. In war, empathy is counterproductive to the mission.

The Potential of Technology: Dehumanization on Steroids

Empathy makes war harder. In 1947, US Army Brigadier General S.L.A. Marshall argued in “Men Against Fire: The Problem of Battle Command” that the vast majority of soldiers on the battlefield in World War 2 never fired their weapons to kill. This wasn’t because they ran or hid – often they were willing to risk greater danger to rescue fellow soldiers or run messages – but because they simply couldn’t bring themselves to kill another human being. In response to suggestions by Marshall and others, the US Army instituted training changes such as de-sensitization and operant conditioning, which brought the firing rate up to 55% in Korea and 90% in Vietnam. Indeed, the less we empathize with “the bad guy,” the easier it is to kill them.

There was an episode of Black Mirror by the same name (Men Against Fire), which explicitly addressed this. In the episode, which is set in the future, a squad of soldiers is tasked with the mission of killing “Roaches,” creatures afflicted with genetic mutations. The soldiers use army-issued implant technology that helps them communicate hands-free, aim their weapons, and view digital renderings of building layouts for tactical planning. When the protagonist’s implant starts glitching, he realizes that the “Roaches” he’s been killing look exactly like “real” people. Eventually he learns the truth: the Army designed the technology specifically to make the infected population look sub-human to anyone with the implant by modifying their perceived physical appearance. Doing so made it easier to kill “Roaches” and reduced the PTSD associated with murdering another human.

Of course, this show is just an allegory. To my knowledge, the US Army isn’t literally implanting tech into our soldiers’ brains to edit their perceptions. But are there other, less obvious ways that technology is currently helping us dehumanize each other?

“We 👏 Shouldn’t 👏  Be 👏  Nice 👏 To 👏 Murderous 👏 German 👏  Soldiers 👏”

Imagine if Twitter had existed back in 1914. If a tweet went out “leaking” that the British and German soldiers were playing soccer with each other in No Man’s Land on Christmas, I’m pretty sure there would be immense public outrage about it. I suspect many people would be upset about how disrespectful it is for the soldiers to be chumming it up with the murderous enemy that killed tens of thousands of British soldiers. Others would probably take it as evidence that either the war is a hoax or at least that the soldiers are all in on the swamp of corruption and aren’t doing their jobs.

I can imagine that in 140 characters or less, it’s really easy to get people mad about the soldiers dilly-dallying and not taking their duty seriously, but it’s less easy to contextualize how rare and beautiful the moment was amid the gore of months/years of fighting on the front line. But that’s the thing about Twitter: very few people have the context but nearly everyone has an opinion. The problem is that Twitter isn’t designed to contextualize issues. It is designed to maximize engagement, which usually means inciting outrage. Outrage is a very effective way to keep users engaged.

A link to Debbie Chachra’s article about design dictating function on Twitter.

Retweeting also allows for what social-media researchers such as danah boyd and Alice Marwick refer to as “context collapse”: removing tweets from not only their temporal and geographic context, but also their original social and cultural milieu, which is very different from most public spaces. … While readers may literally know nothing about the poster or the context except for what is said in that one tweet, they can still just hit “reply” and their response will likely be seen by the poster.

… This amplification and context collapse, coupled with the ease of replying and of creating bots, makes targeted harassment trivially easy, particularly in an environment where users can both mostly live in their own ideological bubble by following people who share their views, however abhorrent, and who can easily forget that there is a real person behind the 140 characters of text.

You could imagine a world where Twitter allowed people to try out ideas they are working through in order to get feedback and then improve. And you can imagine that when someone sees a tweet they don’t understand, there’s some kind of “explainer” mechanism to help the user understand what is being conveyed. But that is not the world we live in.

Our current version of Twitter is full of people harassing each other and throwing lots of snark at everyone they think is stupid. And everyone immediately attributes bad faith to things they don’t understand. In a lot of those cases, they’re even right, because everyone on Twitter is virtue signalling anyway. But in the cases where there really was a good faith disagreement, the snark, harassment, “public fight” mentality, and moral outrage are usually enough to kill any attempt to understand where the other person is coming from.

Are You Saying I Shouldn’t Condemn The Injustices In The World?

Of course, moral outrage is not in and of itself bad. If we see that Republican candidates in Georgia and North Dakota are suppressing non-white voters for political gain, then there’s no generous interpretation that makes that okay. It’s racist and it needs to be stopped. If your reason for throwing hundreds of children in cages is that you want to send a message to potential immigrants that they are not welcome here, the problem isn’t context; the problem is that you’ve dehumanized immigrants so much that throwing them in cages is “worth it” to accomplish your goals. Those kinds of things need a strong and sustained pushback from outraged people.

On the other hand, there are times when de-contextualized moral outrage might do more harm than good; a recent example is the Fight Online Sex Trafficking Act (FOSTA). This episode of Reply All walks through the motivation and likely impact of FOSTA.

A link to the Reply All episode that reports on the story and details of FOSTA.

The movement (supported by a powerful PSA) aimed to stop websites from profiting off of pimps who kidnapped children and advertised sex trafficking on sites like Backpage. FOSTA (and its Senate counterpart, SESTA) was passed to hold websites accountable instead of giving them a “safe harbor” from being forced to proactively remove illegal 3rd-party content. However, most of the sex worker community said that this bill wouldn’t actually stop sex trafficking (because pimps have underground networks) but would hurt non-trafficked women (because they’d have to work on the dangerous street instead of being able to screen clients online first). In essence, they argued that the energy of the outrage was too blunt and wasn’t directed at solving the right problems (e.g. helping vulnerable women/girls early on so that they avoid being forced into sex trafficking).

Knowing which issues require action isn’t always obvious, and context is what helps inform whether we are outraged over the “right” thing. We usually rely on our algorithmically-generated news feeds and timelines as a first step, but we should look into issues further. First and foremost, we should listen to the people who are actually being impacted by the problem or the proposed action.

In addition, in my own experience I’ve found it helpful to give the other side the benefit of the doubt when I first see something that seems crazy. Clearly someone supports this thing, but why and how? It’s usually pretty easy to tell if you spend 5-10 minutes investigating why someone could support it. Sometimes it really is bad faith, but if you always assume that, you’ll be dehumanizing people with whom you might otherwise find common ground and whom you might even persuade. No one thinks of themselves as the villain in their own story, so sometimes it can be useful to understand where someone is coming from. Even if you completely disagree, it can help you understand your own values more clearly.

Why Do Doctors Hate EHRs?

My “Fresh” Perspective on an Old Complaint

A few months ago, I read a very thoughtful and detailed article by Atul Gawande about why doctors hate their computers.

A link to Atul Gawande’s article.

In this article, Dr. Gawande gave a lot of examples of how computer systems have hampered doctors’ workflows. It began with an anecdote about the 3-year nightmare of adopting an EHR at his hospital: clunky interfaces and glitchy software, billions of dollars lost because the “learning curve” of the tech kept clinicians from seeing patients, redundant information overwhelming caregivers, and increased demands on clinicians to enter “required fields” that were arguably not essential. The article then transitions to related discussions about other fields wrestling with tech, the socio-technological ways that computers have restructured doctor-doctor interactions, the benefits EHRs could provide at their best, and how systems need both mutation and selection in order to improve.

To be honest, it was a lot to process, and it felt like 2-3 articles smashed together. But the through-line of the piece was this: the main problem with computers is that they place too many requirements on doctors, focusing their attention toward screens and away from patients. The de-humanization of clinical care is bad, he argued, because in healthcare, patients aren’t just sick; they’re also scared. He argued that when doctors used scribes to take notes and navigate the awful interfaces, both doctor and patient satisfaction seemed to improve. Essentially, I read his argument as saying, “I’m not opposed to technology, but I am opposed to bad, inappropriately designed technology.”

After first reading Dr. Gawande’s article, I started to see a connection to other areas of my life where technology and society were colliding unceremoniously. Technology gives people the power to do new things that can be amazing, terrible, or both. It can scale up operations we’ve never been able to do before, enabling mass surveillance at the push of a button or perpetuating bias in algorithmic decision-making even more explicitly than unconscious human bias. And these technological “disruptions” usually happen faster than most other aspects of society change, which means we usually don’t get the chance to sit down and rationally decide who will benefit, who will be harmed, and whether that is a good tradeoff for society.

In the case of EHRs, it seemed to me that the issue wasn’t necessarily that doctors didn’t like interacting with bad systems, but rather that they were uncomfortable with how technology was forcing them to comply with the procedures that administrators, insurers, etc. demanded of them. In the old days, the doctor was in charge of what they decided to write about a patient (despite passive-aggressive reminders/trainings to remember to write “pneumosepsis” in order to bill more than they would for just “pneumonia”). But now, technology has given power to other players and thus taken power away from the doctors, who must enter all of the information that the administrators want to collect. I shared my thoughts on facebook to discuss it with my friends, and one of them suggested I should consider blogging some of these thoughts.

A link to my facebook post.

Sometimes I’m Wrong ¯\_(ツ)_/¯

Hopefully you only skimmed or skipped the facebook post, because it wasn’t particularly well-argued or cohesive. I wanted to do a better job for this first blog entry, so I decided to read more voices and get more opinions on the issue to make for a better discussion. I came across a fantastic (and rather short) book all about doctors and the growing pains of technology, “The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age” by Dr. Robert Wachter. This book changed my mind and convinced me that I was wrong.

A link to Robert Wachter talking about his book.

The Digital Doctor was written by an M.D. to explore healthcare at the dawn of the computer age. It tells the story of the current state of EHR systems, how we got here, the tangible harms this sometimes causes, and where we could hopefully end up one day with Health IT. The author’s argument is that although computers have for sure made healthcare safer, there are still a lot of problems that need to be addressed and many of the people in power are too reluctant to touch the issues in the systems when they arise (sometimes because of irrational risk aversion, sometimes because government policies are misguided or coercive, sometimes because any criticism of Health IT can get you labeled as a technophobe and a luddite).

The entire first part of the book is dedicated to the current version of Health IT: the context in which it was created, the problems it mostly solves, and the failures that arise specifically because of the interventions used to get computers adopted by hospitals. One example is the medical note, which used to be how doctors communicated a patient’s narrative of care from one clinician to the next. Policies mandating that notes be comprehensive for legal and billing purposes incentivized doctors to copy and paste pages of notes every time they see a patient, because they won’t get paid as much if the note is too short. At best this leads to irrelevant parts of the patient’s state being recorded “just in case,” and at worst it harms care by propagating stale information, such as an abnormal heart rate from weeks ago. One particularly funny anecdote Dr. Wachter cites is a patient whose daily notes all mention taking his temperature in his foot and getting a reading of 98.5 degrees F, which wouldn’t necessarily be weird if that patient’s legs hadn’t been amputated earlier that month.

After reading The Digital Doctor, I realized I’d been far too dismissive of the usual gripes about the design of EHRs. My facebook post had only briefly acknowledged clunky design before quickly moving on to what I thought was the “real” issue:

“So everyone gets mad at the EMR (which is admittedly very clunky and tbh worse designed than it needs to be) when in reality, the EMR is just the medium for competing values and interests. The tools make these conflicts more direct, and it makes the balancing act a lot harder by giving everyone more power to fight with one another.”

Dr. Wachter persuaded me that I wasn’t focusing on the right problem. While it’s important for people outside a system to think for themselves with a fresh perspective, it’s also important to take seriously those affected by an issue when they tell you what they think the problems are. Sometimes they’re right, sometimes they’re wrong, and usually it’s a little bit of both. In my case, I hadn’t appreciated the impact poor design can have on how caregivers practice. Poor design leads to alarm fatigue: because a given machine that a patient is hooked up to might beep or raise warnings 150+ times per day, nurses learn to ignore the beeps (because the machine often treats “these two drugs have occasionally been shown to interact poorly” the same as “you just ordered a 38x overdose and will poison the patient”). Poor design leads to interruptions and distractions: unlike EHRs, airplanes have a “sterile cockpit” rule during critical phases of flight, such as below 10,000 feet. The FAA recognizes that pilots need to devote their undivided attention to flying (rather than documenting takeoff time or fuel levels) because otherwise their performance will degrade and people will die. These design principles have not been applied to doctors working in the ICU.

The Future of Health IT

Technology should be improving care, and if the current system isn’t just inconvenient but actively harmful, then the design is more than just “admittedly very clunky.” But we’re still just at the beginning of the information age. Experts agree that machines are not intended to replace doctors but rather to change the landscape of care. Sometimes this is good (e.g. telemedicine, which could eliminate many unnecessary trips to the doctor’s office) and sometimes it creates losers (e.g. radiology has been decontextualized from the patient’s care ecosystem, sometimes even offloaded to radiologists in India or, perhaps one day, to image-to-text computer algorithms).

As I thought more about how technology could change the landscape of care, I was reminded of an old episode of my favorite podcast The Weeds from May 2016. In the first 33 minutes of this episode, they discuss The Productivity Paradox and Robert Solow’s 1987 quip “You can see the computer age everywhere but in the productivity statistics.” They spend most of their time discussing Health IT specifically.

It’s like how the advent of electricity initially led to only a slight increase in productivity, until engineers realized that they shouldn’t just put electric motors where the steam engines used to be, but should instead redesign their factories around the new small motors in more efficient ways than big, clunky steam-powered tech could allow. Even seeming success stories in IT are not yet showing the gains we might expect: search engines and Google Maps undoubtedly give me access to information at levels previous generations never had, but we still are not seeing growth as wildly rapid as we saw in the 1940s and 1950s. We still seem to be only tinkering with our metaphorical engine placement rather than redesigning the whole factory. Like Henry Ford said, “If I’d asked people what they wanted, they would have said, ‘faster horses.'”

A link to an episode of the Weeds where Ezra, Sarah, and Matt discuss technology and innovation in healthcare.

One thing I will note, however, is that The Digital Doctor and The Weeds both reached very similar conclusions about the promise of technology and innovation for healthcare, and they saw similar regulatory and societal issues that need to be solved before we can really unlock that potential. Perhaps that means these sources have identified the “little bit of column A, little bit of column B” for how to solve most problems, or perhaps they share biases/assumptions. If anyone has any reading/listening recommendations for other wildly different interpretations of what we should do for the future of Health IT, I’d love to hear them! It’s always a little concerning when everyone agrees because that usually means something is being overlooked or undersold.

Now What Do I Do?

When a psychiatric patient is discharged from the hospital, their caregivers write a discharge summary describing the course of treatment and why they believe the patient is stable enough to leave. Unfortunately, this assessment isn’t always correct. Some studies have found that 40-50% of patients discharged with depression or schizophrenia are readmitted within a year. Readmissions (especially quick ones, like 30-day readmissions) are bad for patient care and bad for the hospital. I’m working on a research project that tries to use machine learning to improve risk assessments of whether a patient is low-risk enough to go home, based on their discharge summary.

Reading through notes is both physically and mentally exhausting for humans. It’s distressing to spend hours reading about patients suffering from psychiatric disorders and difficult lives involving homelessness, suicide attempts, alcoholism, and more. And no note is “easy”: a single note might mention both protective and harmful indicators, such as a loving family, a history of substance abuse, a stable job, and an abusive relationship. Not to mention, the patient is only being discharged because their caregivers do think they’re healthy enough to leave. Based on preliminary results, it usually takes a human 3-6 minutes of reading a single note to decide readmission risk one way or the other.

Despite many of the successes of AI, some tasks are still hard, and natural language processing is still (for the most part) one of those tasks. Based on preliminary (not-yet-published) experiments, humans are still able to outperform the models we have tried so far.

But this situation seems like the perfect opportunity for an AND rather than an OR. If the goal is to identify high-risk patients, then it doesn’t need to be purely computer or purely human. I want to explore whether machine-generated explanations can help humans make better-informed risk assessments. The experiments are still in their early stage, but I’m especially interested in questions such as:

– Would human performance improve if shown ML-derived risk scores?

– Can we decrease the amount of time it takes to read a note while maintaining (or improving) human performance?
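To make the first question concrete, here’s a minimal sketch of the kind of ML-derived risk score a human reviewer might be shown. To be clear: this is my own illustration using a generic scikit-learn text-classification pipeline and made-up toy notes and labels; it is not the model or data from the actual (not-yet-published) experiments.

```python
# Illustrative sketch: score a discharge note for readmission risk.
# The notes, labels, and model choice here are all synthetic stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy "discharge summaries" with made-up readmission labels (1 = readmitted).
notes = [
    "patient has stable housing and strong family support",
    "history of substance abuse and prior suicide attempt",
    "steady employment, engaged in outpatient therapy",
    "homelessness and alcoholism noted, poor medication adherence",
]
readmitted = [0, 1, 0, 1]

# Bag-of-words features + logistic regression: a standard baseline
# for text classification when fancier NLP models underperform.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(notes, readmitted)

# Probability of readmission for a new note -- the kind of risk score
# that could be shown alongside the note to a human reviewer.
risk = model.predict_proba(["unstable housing, history of alcoholism"])[0][1]
```

The appeal of a simple linear model here is that its per-word weights can double as a crude “explanation” of which phrases drove the score, which is exactly the human-plus-machine workflow the questions above are probing.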

I can’t say for sure whether I’ll continue working on questions specifically like this, but I am absolutely fascinated by how technology changes the way we interact with our jobs and with one another. I’d love to hear any more suggestions for what to read. Perhaps I’ll realize once again that I’m focusing on the wrong thing!