Faculty at Northeastern in Seattle Develop AI Detection Tools

Is using AI for academic work cheating? The debate rages on, but Shanu Sushmita has a more useful question: what is a student using AI to do?
To Sushmita, a Khoury College assistant teaching professor with two decades of experience in artificial intelligence, a student going to ChatGPT to generate a paper wholesale is very different from a student using all the tools at their disposal to organize their notes, check their grammar, or proofread work in a second language. Teachers need tools that can tell the two cases apart. That’s why she’s developed an AI detector that can not only detect generated text but determine its context.
“I heard a lot of students’ college application essays that were flagged for plagiarism were not actually plagiarized; that really broke my heart. How can we create a more mature AI detector?” Sushmita said, pointing out how existing AI detector tools were biased against non-native English speakers. “It’s not black and white. Our goal is to have AI for good.”
To Sushmita, the key to making our interactions with AI safer and more useful is to account for context, nuance, and intent. Whether it’s the broad variety of projects in progress at her Generative AI Research Lab, or her work to bring more women into the industry, she brings an eye for the grey area to everything she does.
Sushmita began studying artificial intelligence, machine learning, and natural language processing twenty years ago, drawn to the field by her love of mathematics. She felt lucky her father supported her interests, but the wider field of computer science was only beginning to open its doors to women at the beginning of the 21st century. She wasn’t deterred by being the only woman in her PhD program, instead diving headfirst into user interactions with search engines, social media, and predictive healthcare analytics. She even spent a year leading research for the healthcare analytics software company KenSci, before realizing her heart lay in academia.
When Sushmita joined Northeastern in Seattle in 2021, she pitched the Generative AI Research Lab, a space to tackle complex questions in the quickly emerging field. At the time, she didn’t even know if the questions she was asking – like whether it was possible to detect what AI was being used for – could be answered, so she was surprised and delighted by how quickly Northeastern agreed to support her initiative.
“I was really overwhelmed with the kind of support that I got, and I continue to get,” she said. “I just have to say that I’m interested in doing something, and somehow, Northeastern finds a way to make that happen. I feel very fortunate.”
With the Generative AI Research Lab, Sushmita began studying academic AI detection in 2023. Initially, she found the field skeptical that AI-generated text could be detected at all, much less that she'd be able to tell use cases apart.
“It turned out it was actually pretty easy; we achieved so much accuracy that it freaked us out. We went back and spent another month just debugging, making sure our code didn’t have any errors,” Sushmita said. Her initial work – separating AI-generated content from work written by a human – achieved 99% accuracy; from there, she and her graduate student Rui Min developed another tool to separate AI-generated work from work paraphrased by AI, which currently achieves 93% accuracy.
Sushmita and the Generative AI Lab are tackling other projects, too. They are currently crafting a model to detect objectionable content in music, to help parents monitor their children’s playlists. Instead of flagging profane words, their model identifies subtler themes of violence, sexuality, and substance abuse.
Her team also works to “jailbreak” LLMs, finding prompts that elicit dangerous information in a process similar to white hat hacking. The goal is to discover and flag queries with ill intent before bad actors think to attempt them. For example, commercial LLMs will refuse to answer a query about how to build a bomb, but might not recognize the query for what it is if it’s phrased as a math equation or a hypothetical – and Sushmita wants her team to be the ones to find that out.
“How do you tell something that has the knowledge of the world to not answer a question, not fall into a trap? One way is to identify the traps it can fall into in advance,” she explained. Jailbreaking LLMs is critical “to ensure that we continuously make these large language models that are accessible to people more and more safe.”
To keep such a diversity of projects moving forward, Sushmita relies on her graduate students, who are “amazing, talented, and really, really shining” people. She works with research assistants, apprentices, and capstone project students, and she strongly encourages students to stay open to the many possibilities the field has to offer. But students don’t have to be at the graduate level to get involved.
“Northeastern and I invited some high school girls to come and attend the Seattle campus’s Remarkable Women in AI panel this past spring, because I wanted them to see what women in the space looked like. The girls got to speak to the VP of Amazon in AI; that’s great exposure,” said Sushmita, who serves as the Washington State ambassador for the Women in AI nonprofit organization. “I’m a mom of two girls, so I want them to see a normalized and balanced world.”
As she looks forward, Sushmita is excited to continue to refine her existing detection models, and to tackle emerging questions such as whether generative AI tools can provide tailored, on-demand tutoring support. Wherever she applies herself, she wants to bring a spirit of open-mindedness, inclusiveness, and nuance to her work.
“My mission is to build things that impact people’s lives in a good way,” she said. “At my lab, we focus on choosing the problems we deeply care about in the long term.”
By: Madelaine Millar