Episode Description
Following on from our last episode about ChatGPT, today we’re tackling the technical side of ChatGPT. ChatGPT is a generative text tool created by OpenAI that’s been causing a stir amongst educators and students alike. To discuss this topic further, we welcomed Annie Chechitelli, Chief Product Officer at Turnitin and Linda Feng, Vice President of Architecture at D2L. Our guests and Dr. Ford chatted about:
- The technology behind how ChatGPT and generative AI tools work
- How academic integrity comes into play with respect to ChatGPT
- Decoding the hype around AI
- Why we must address the ethics and inherent bias in generative text tools
- How ChatGPT could benefit or hinder progress for students and educators
Show Notes
01:44: An introduction to Linda Feng and Annie Chechitelli
05:27: Linda explains how ChatGPT is trained and why the need to fact check still exists
11:30: Annie breaks down the background of ChatGPT and shares a bit about Turnitin’s AI innovation lab
18:03: Turnitin’s current plans for flagging and surfacing AI-generated content in classrooms
27:17: Linda’s thoughts on the impacts of ChatGPT in higher education and K-12 moving forward
30:15: Annie discusses how ChatGPT can positively impact students
33:23: Cristi asks what some potential applications of ChatGPT in specific domain areas could be and how soon we can expect to see them
Full Transcript
Dr. Cristi Ford (00:00):
Welcome to Teach & Learn: a podcast for curious educators brought to you by D2L. I’m your host, Dr. Cristi Ford, VP of Academic Affairs at D2L. Every two weeks I get candid with some of the sharpest minds in the K-20 space. We break down trending educational topics, discuss teaching strategies, and have frank conversations about the issues plaguing our schools and higher education institutions today. Whether it’s EdTech, personalized learning, virtual classrooms, or diversity and inclusion, we’re going to cover it all. Sharpen your pencils, class is about to begin.
Welcome back, educators. We had such a rich conversation around ChatGPT on our last episode that we decided to tackle this topic from another angle. I want to thank Dr. Aumann and Dr. Bouchey for helping us examine the academic side. If you missed that episode, you can take a listen on Apple, Spotify or wherever you listen to your podcasts. But today we’re in for a real treat in being able to discuss the technical implications of AI-informed practices that impact our work. If you’re like me in online education, many of us have found our work at the nexus of technology and pedagogy for many, many years, so what could be better than learning from the two fantastic guests we have today? I’m so excited to have our guests joining us.
And we really think that this episode will shed some light on the technical side of ChatGPT, and help us as educators think about the balance, the pros and cons, and the implications of generative AI tools more generally.
So before we jump in, let me just take a moment to introduce both of our guests. I’m really, really excited to have both of them joining us today. I’ll start with Linda Feng.
Linda Feng is currently the vice president of architecture at D2L, and Linda comes to us at D2L from Unicon where she focused on providing strategic consulting to leading institutions, major publishers and EdTech companies related to integrations and learning analytics.
Most recently she helped several educational institutions design and build an analytics pipeline, where she was able to use tools like AWS and Google Cloud Platform. And Linda was formerly a senior product manager with Instructure and has over 20 years of experience in database, server and applications product development at Oracle. Linda, you’re going to have to break that all down for me a little bit later as an educator on the call here. She’s also published many articles and spoken at numerous education conferences, shout out to EDUCAUSE and ELI and IMS Global, which is now 1EdTech, and she’s really been very productive and thoughtful about where this work is moving. And so I was really glad to have you, Linda, join us today. So thanks for being on the call.
Linda Feng (03:00):
Thanks, I’m super excited to be here.
Dr. Cristi Ford (03:02):
All right, so we’re going to round this conversation out with another really great voice in the space, and really honored, Annie, that you could join us today. Annie Chechitelli is the chief product officer at Turnitin. As CPO at Turnitin, Annie oversees the Turnitin suite of applications, which includes academic integrity, grading and feedback, and assessment capabilities. And Annie, I have spent many years as a proponent and user of Turnitin, and so it’s really great to see that this is in your portfolio. Prior to joining Turnitin, Annie spent over five years at Amazon, where she led Kindle content for school, work and government, and launched the AWS EdTech growth advisory team.
This team advised educational technology companies on how to grow their product and go-to-market strategies with AWS. Prior to that work, Annie spent most of the EdTech startup part of her career with Wimba, where she launched a live collaboration platform for educators, which was ultimately acquired by Blackboard in 2010. At Blackboard, she led platform management focused on transitioning Blackboard Learn to the cloud. Annie holds a BS from Columbia University and an MBA from Claremont Graduate University, and she lives in Seattle. Annie, thank you so much for joining us as well today.
Annie Chechitelli (04:23):
I’m super excited to be here. And we should mention that Linda and I are also friends, so that makes this fun for us too.
Dr. Cristi Ford (04:30):
Absolutely.
Linda Feng (04:31):
Yes, and we have known each other through many of those great conversations at all of those different conferences that we’ve both gone to.
Dr. Cristi Ford (04:40):
Excellent. So I think that this will be a really rich conversation. And as we start, I’m going to start with you, Linda. I wanted to kick off the episode, I talked a little bit in our previous episode about ChatGPT for our listeners who don’t have a technical background, but I would appreciate more info from you. It’s really incredible what I’m starting to see and hear from educators about what it can do, but what I’m understanding is it’s not an intelligent agent, and it’s not alive. Can you explain to us and the educators listening how this tool is trained, why it doesn’t have some of the most up-to-date information, and why humans still need to fact-check what’s coming out of the system?
Linda Feng (05:27):
Yeah, absolutely. I think this is important for us to get started on and really level set. I guess before I start, I just wanted to also mention, as you said in the intro, I am new to D2L, I just joined within the last month, so a lot of my perspective about this area comes from work I’ve done prior to D2L as an EdTech architect working in the field with communities like EDUCAUSE. But since I joined D2L about a month ago, I have really jumped into the research that’s being done by our engineering teams here. And so a lot of what I’m going to highlight today is really aided by some of the analysis that has been provided by our principal engineers.
And so I’ve been following along with this really great breakthrough with ChatGPT, and I love seeing the discourse that’s out there, where our whole industry is really trying to figure out what this means for our sector. But I think it’s important to note that the field of AI has been developing for, some would say, 40 to 50 years. And the consensus now is that what ChatGPT and other generative AI technologies like DALL-E have brought us at this moment is a friendlier interface for average human users. Things like ChatGPT are really, underneath the covers, going against a large language model that’s trained on a giant corpus of data, with something like 175 billion parameters, and anything it’s doing is really just a reflection of the inputs that it’s been given.
It’s sort of like what you’d get if you had a really good, smart assistant who puts stuff together for you using Google searches. And the issue I think we have discovered is that what it comes up with sounds amazing, but really all it’s doing is being very good at predicting the set of words that should come out in response to a prompt. This can result in what people are calling hallucinations: it can come back with a response that sounds plausible but is factually incorrect. So I think that’s where we have a responsibility, around the tools that are being built on this AI as it develops and emerges, to be responsible about how we see it being put to use, especially in our field.
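Linda’s description of next-word prediction can be made concrete with a toy sketch. A real LLM uses a neural network with billions of parameters; the stand-in below just counts which word follows which in a tiny made-up corpus, but it shows the same mechanism: the output reflects only patterns in the training data, with no notion of truth, which is exactly how a plausible-but-wrong "hallucination" can arise. All the names and data here are illustrative, not anything from ChatGPT itself.

```python
# Toy next-word predictor: bigram counts over a tiny corpus.
from collections import Counter, defaultdict

corpus = (
    "the model predicts the next word "
    "the model predicts the most probable word"
).split()

# Count which word follows which in the training data.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word seen after `word`, or None."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

# The prediction is just the statistically likeliest continuation --
# the "model" has no idea whether the resulting sentence is true.
print(predict_next("the"))  # → "model"
```

Scaled up from bigram counts to a transformer over a web-scale corpus, this is the loop that generates each word of a ChatGPT response.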
Dr. Cristi Ford (08:05):
Really good framing Linda, and I appreciate and shout out to us at D2L for getting you to be a part of the partnership. It’s been great to get to know you in this last month as you’ve joined.
I heard a couple of things really clearly: one, that this isn’t new, right? AI-generated tools have been around for decades. Two, that this is very analogous to having a really good person who could do a lot of Google searching for you. Can you help educators understand where the factual breakdown happens? Number one. And number two, I’ve also heard from some educators that it’s really only good at providing data on events that happened before the pandemic.
Linda Feng (08:52):
Right. So that has to do with the language model that it was trained on. And so we know that future iterations of this will get better, and my understanding is that there are certain tools, I think the Microsoft version that’s used for Bing, that apparently don’t have that limitation. So while what people are interacting with in the form of ChatGPT today does have that limitation, there are other tools out there that can produce information that’s more current. So that’s what I’ll call a temporary limitation. I think we have gotten to a point, and I don’t know what the right analogy is, but if there are enough people out there, it’s like the law of averages, I guess.
If enough people out there are saying one thing, you can see the pattern of what people would probably want, or conceive of as the true answer. But if there’s enough of the wrong answer out there, then that could equally be chosen, and that’s where I think we have some of that risk. One of the things people talk about too is that the algorithm is what’s called stochastic, not deterministic. There are settings you can apply to help level this out, but in order to make the responses appear more natural you need a little bit of that variability. What that means is that if you ask or prompt the same thing twice, you may not necessarily get the same answer.
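The stochastic-versus-deterministic point Linda makes is usually controlled by a "temperature" setting in these models. Here is a minimal sketch of that idea, with made-up next-word probabilities standing in for a real model’s output: at temperature zero the same prompt always yields the same word, while at a higher temperature repeated runs can differ.

```python
# Sketch of temperature-controlled decoding over invented probabilities.
import math
import random

def sample_next(probs, temperature, rng):
    """Pick a next word; temperature ~0 is greedy (deterministic)."""
    if temperature <= 1e-6:
        return max(probs, key=probs.get)       # always the top word
    # Re-weight each probability by the temperature, then sample.
    words = list(probs)
    weights = [math.exp(math.log(probs[w]) / temperature) for w in words]
    total = sum(weights)
    return rng.choices(words, weights=[w / total for w in weights])[0]

# Hypothetical distribution for the word after "the sky is ...".
next_word_probs = {"blue": 0.6, "cloudy": 0.3, "falling": 0.1}

rng = random.Random(0)
print(sample_next(next_word_probs, 0.0, rng))  # → "blue", every time
# With temperature > 0, the same prompt can come back differently:
print([sample_next(next_word_probs, 1.0, rng) for _ in range(5)])
```

Lowering the temperature "levels out" the variability, as Linda says, at the cost of more repetitive, less natural-sounding text.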
Dr. Cristi Ford (10:51):
That’s good. That’s good. That’s really helpful, Linda. And when you talked about enough people saying the same thing, kind of crowdsourcing even maybe misinformation, it takes me back to my Wikipedia days. Annie, I’m thinking about something you wrote in January in a Turnitin blog where you talked about the AI innovation lab and how you’re using technology to flag plagiarism. As we’re having this conversation, can you give us some insights into how you see this technology working, and perhaps some ideas about where you’re going with the innovation lab?
Annie Chechitelli (11:30):
Yeah, sounds great. And I wanted to start in kind of the same place Linda did, with some background. It’s helpful to know where Turnitin is in the evolution. ChatGPT was released on November 30th, 2022, but generative AI writing is not new. GPT-3 is the main part of the model (I mean, ChatGPT is based on GPT-3.5, but GPT-3 is pretty close), and that was released in June 2020.
So the actual model that is used was released almost three years ago now. And if you were part of the AI community then, and even now you can go back, its power and what it could do were well discussed. At that point in time, OpenAI released it as an API, waiting for third parties to develop the interface that would be useful for whatever market it was.
So before ChatGPT, I was spending a lot of time with Jasper AI. That’s one that was really good and really focused on marketing content, just getting a lot of marketing content out fast. And even on our team, we were already playing with the models prior to ChatGPT, both from a technical detection perspective and from the perspective of understanding the power as well as the limitations. But then, and if you listen to Sam Altman in some recent interviews, he’ll say they didn’t really want to do it. They were waiting for another third party to do it, but they were like, well, if no one else will build an app that is super easy for users, to show the power of this and to get people using it so we get data, I guess we’ll do it.
So ChatGPT is really just a fantastic example of simple, very high quality UX. They knew exactly who the user was, what they needed and how to simplify it. And that interface then reaches out to the GPT-3.5 model to pull the information in a way that is useful. So we here at Turnitin had been working with the model since June 2020 to understand the statistical patterns: what is it doing, what are the limitations, and how does it work? And as Linda mentioned, it’s trained on a huge slice of the internet, however many billions of documents and webpages. The technology is not writing for meaning, and this is really important and it sounds obvious, but some people miss it. It’s not trying to understand the question, think about what the best answer is, and then come up with the full answer.
It goes word by word, similar to what you’re used to if you use Google’s autocomplete, which completes the sentence. It’s like that, but a major improvement. So it uses the whole internet, it looks at the context of the text that’s come before, and it looks at the prompt itself, all the information coming in, and it uses all of that to predict the next highest-probability word in the sentence. And then it goes to the next highest-probability word after that. And all of these choices are statistically predictable, because it’s always choosing the next highest-probability word based on what it has. It’s not actually answering a question in the way that you and I would answer a question, and that comes back to how it could be so factually wrong: because it’s not really answering a question, it’s predicting the next word.
And so, at Turnitin, what we’re doing and what we’re working on, and if you go to our blog you can see some of our early lab detection, is we go word by word, and we can tell that humans don’t write that way. Just like there’s no perfectly average human. By the way, there’s such a thing as an average, but nobody is average, right? Nobody is the perfect height, the perfect weight, the perfect anything. It doesn’t exist, and the same is true with writing. And this is why, if you look at not just our detectors but other detectors too, the more words you have, the higher confidence you have. Seeing that consistently probabilistic word choice is a result of AI writing, because humans don’t write that way, and so we’re able to apply that. That’s a very simplistic description of all the math and science we do to say, with high confidence, that something was written by GPT-3.5, or ChatGPT in this instance.
But also, in the future, whatever those large language models are that become accessible to students as well. So I’ll stop there, and then we can go into our specific plans if you want to hear more, but that’s kind of our frame for how we think about the writing and the technical component, and how we detect it.
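The detection intuition Annie describes, that machine text picks the statistically likeliest next word far more consistently than humans do, can be sketched in a few lines. This is a deliberately simplified illustration, not Turnitin’s actual detector: the `top_prediction` table is an invented stand-in for a real language model, and the scoring just measures how often each word matches the model’s top guess.

```python
# Toy "consistency" score: what fraction of a text's word transitions
# match a model's single most probable next word?
def top_pick_rate(words, top_prediction):
    """Fraction of next-word choices matching the model's top prediction."""
    hits = sum(
        1 for prev, nxt in zip(words, words[1:])
        if top_prediction.get(prev) == nxt
    )
    return hits / max(len(words) - 1, 1)

# Hypothetical top next-word predictions for a toy model.
top_prediction = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

machine_like = "the cat sat on the cat sat on the cat".split()
human_like = "the tabby perched on a sunlit windowsill today".split()

# Machine-like text tracks the top prediction almost perfectly;
# human writing wanders off the most-probable path.
print(top_pick_rate(machine_like, top_prediction))  # → 1.0
print(top_pick_rate(human_like, top_prediction))    # → 0.0
```

It also shows why, as Annie notes, longer samples give higher confidence: the more transitions you observe, the harder it is for a consistently high-probability pattern to occur by chance.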
Dr. Cristi Ford (16:43):
So Annie, that was really informative for me to hear: one, that for Turnitin, thinking about the innovation lab work, this is not new research. Even though ChatGPT was introduced on November 30th of last year, this technology and some of the origins of ChatGPT have been around for a couple of years now, and your work has really been evolving with that. And to hear that OpenAI was really waiting for another entity to take it on, and then just said, okay, it’s been three years, no one’s doing this, we’re going to take up this space. Really fascinating to hear that. And also, when you talked about the spaces around plagiarism, it’s almost like we’re all writing snowflakes, right?
And so there is no perfect example of the way people write; no two are going to be identical. And you shared with us as educators that when a question is asked of ChatGPT, there is no comprehension actually happening, right? It’s not looking to answer the question; it’s doing a statistical prediction of the very next word. So from that, I’d love to hear what you’re finding and where you’re going at Turnitin to really be able to help institutions and make a dent in some of this.
Annie Chechitelli (18:03):
So we’ve been an academic integrity company for over 20 years, so we see this as both an opportunity, and I’m sure we’ll talk later about all the different learning opportunities we have with the power of this technology, but also an opportunity for misconduct, for not using it appropriately, or for just using it as a crutch in a way that teachers do not want or haven’t sanctioned. There are plenty of opportunities for teachers to say, I sanction this, and that’s great, and those are very different assignments. But what we’re hearing from the community is that even if it’s sanctioned, teachers want to know. They want insight into whether an AI assistant was used to generate this document, and perhaps how much.
Is it just a little bit to get started, or is it all the way through? Is the whole thing just written by ChatGPT? That’s a different motivation, we would say, than maybe they got started with ChatGPT and then made it their own. Those are different, and the teacher can decide what that means for them. And so we’ve been working on this. For those who use Turnitin, we have what’s called a similarity report, where the teacher is able to see, here are the places in the document where we think it matches something else. It could just be poor citation; it doesn’t have to be intent, right? It’s not always cheating, but we point those things out for teachers to go investigate. We’ll be adding to the similarity report an indicator showing the percentage of the document that we deem, with very high confidence, to have been written by ChatGPT or another tool that uses GPT-3 or 3.5.
And then they click into that to learn more and they’ll see a report. You can all go to our website and see a prototype of this. It’s not the prettiest UI, but it’s actually quite close. And so you’ll see, sentence by sentence, where we believe something was written by ChatGPT, and we let the teacher make the determination of how to have the conversation with the student. Maybe it’s asking more questions, maybe it’s quizzing them, or maybe it’s just having a basic conversation. So we leave that to the teacher. But what’s different here is that in the similarity report, and even in other things that we do, there’s always a source document. And so it’s been comforting for teachers to be able to say, oh, okay, I can see the Wikipedia page, I see what they wrote, and I have more of that confidence. Whereas in this situation, there’s no source document, and so our goal at Turnitin is to be very transparent about what we’re doing.
And we’ve done that in the past with our blog, as well as with all of the information we’re going to be learning. So our plan is to release this as early as April. And we’re releasing it in preview, meaning we’re not saying it’s fully perfect production. How could it be, at this point in the innovation cycle we’re in, with how fast and rapidly this is changing? So we believe it’s better to give teachers the tool; they can decide to use it or not use it. And then we’ll get the data and we’ll start to learn, and they will start to learn as teachers as well. There’s actually a really good article today, I think it was on Vice, about how you can read ChatGPT output and start to learn the pattern: it’s very fluffy in a certain way, so well written but saying nothing, which is kind of one of the indicators right now.
And so teachers are getting a better sense of it as they’re using it. And the same with this tool: we want to put it out there to get that feedback and learn. And even though we’re trying our best to get the messaging right, even just how you use it and interpret it, I know we’re going to be wrong. That’s the one thing I know for certain: some of it will be wrong. Not the detection, but how we explain it to a user. It’s a very complicated notion. The other piece that’s important to us, as we were looking at our own algorithm, is where the risks are. We believe the biggest risk in AI writing detection is false positives. No one wants to, and ‘accuse’ is too strong a word, but we don’t want to cause a situation where a student is questioned and then loses motivation and engagement.
We want to limit that as much as possible. So we use two metrics in looking at our efficacy. The first is called precision: of the writing we flag as AI-generated, how much of it really is AI-generated. Recall is how much of the actual AI writing the detector picks up. So think of precision as guarding against false positives. We have a 99% precision rate and a 97% recall rate. We would rather let something go by, so we’d rather let some AI writing pass unflagged than flag human writing as AI. And so that’s how we’ve tuned it. And we’ll continue to update those numbers.
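The two metrics Annie cites are standard ones, and a tiny worked example makes the tradeoff she describes concrete. The counts below are invented for illustration (they are chosen to land near the 99%/97% figures she quotes), not Turnitin’s actual evaluation data.

```python
# Precision and recall for an AI-writing detector, with invented counts.
def precision(true_pos, false_pos):
    """Of everything flagged as AI writing, the fraction that really was."""
    return true_pos / (true_pos + false_pos)

def recall(true_pos, false_neg):
    """Of all the actual AI writing, the fraction the detector caught."""
    return true_pos / (true_pos + false_neg)

# Hypothetical counts: 97 AI passages caught, 1 human passage wrongly
# flagged, 3 AI passages missed.
tp, fp, fn = 97, 1, 3
print(round(precision(tp, fp), 2))  # → 0.99, rarely wrong when it flags
print(round(recall(tp, fn), 2))     # → 0.97, a little AI slips through
```

Tuning toward precision, as Annie describes, means accepting a few extra false negatives (missed AI writing) in exchange for fewer false accusations of students.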
Those numbers are lab numbers, and that’s also important to note. We’re being very transparent. What you’re going to get in the wild, and what we’re going to learn in the wild, is not the same as the lab, and we’re checking it now. We’re able to run it and see what we’re getting, and so we’re making sure that we are transparent about those numbers with the community. What we’ve heard from the community, and this is just us, is that they’re only confident in using a tool that’s 75% accurate. And then we ask them, what do you mean by accurate? Do you mean that it catches everything, or that there are no false positives? It’s a very complicated set of numbers, and so they tend to say both. So we’ll put more of that on our site as it develops, but I can guarantee you it will develop as we actually put it into production.
Dr. Cristi Ford (24:55):
And even on something that’s so critical when you talk about the reliability of those reports as an educator myself, I really appreciated when you talked about the importance of making sure that you’re not creating additional challenges or barriers or detractors for students to continue to be engaged. When I talked with faculty who are working with individuals for whom English is not their first language, or individuals who may have an auditory processing delay, they talk about ChatGPT being a good opportunity for students to be able to get their first thinking down and utilizing the system to really be able to hone those thoughts and those opportunities. And so we’re all learning around this trajectory and this growth curve.
And it is nice to know, one, that there’s a lot of transparency you’re offering around the work that you’re building. Two, that you’re listening and talking with educators to make sure there’s the value add that is needed and that there’s some comfort level as we grow. And so I think about how much frenzy there is now around GPT-3.5, the model behind ChatGPT. You’ve talked about it, but I know that there have been a lot of conversations about this version being replaced. We’re hearing a lot from OpenAI about the promise of performance leaps, improved text generation, and a huge impact on businesses. You talked about Jasper earlier in marketing, and so I wonder about where we’re going to move as we see the next iteration of the platform.
What are your thoughts on the impacts on higher education, both good and bad, that educators just need to be preparing themselves for as they start to think about teaching over the summer and the fall? Linda, you already gave me a sense that the system was trained at a point in time, so it only had knowledge up to a certain point, but now it may have more current information about current events. What are either of your thoughts on the impacts on higher education and K-12 moving forward?
Linda Feng (27:17):
I’ll start, and Annie, you can fill in. So my understanding of where version four is going is that it’s more of the same. It’s still going to be text only. One thing people are talking about is that it might be better at generating code, which is something a lot of people I work with have been playing around with. And so there’s some interesting stuff we could look at in terms of engineering productivity, really helping to make sure it’s like an assistant behind the scenes for developers. In terms of the impact on our education systems, I think it really is just more reinforcement that we need to assume it’s already there. And I really appreciated the educators you talked to last time, Cristi, ’cause they brought up some great points to help people understand that you could ban it, you could try to go to oral exams and make people write essays in the classroom, but none of those are really going to work.
You really have to embrace how you’re going to run your classroom assuming the assistants are already there. Another thing I’ll say is that in the realm of these kinds of AI, there’s a researcher, I believe from Princeton, who has presented at MIT and talks about the three types of AI. There’s the perception type, which is a lot of what we’ve been talking about here: face recognition, speech to text, those kinds of things. Then there’s the realm of automating judgment, which is things like spam detection; I think automated essay grading and content moderation might be in that category. And the last one is around predicting social outcomes, like job performance or at-risk kids. Those, I think, we are not yet confident in as a community, that they are really usable or that we have good technology for them.
But I can see that over time we may start to move up that ladder, and so that’s the area that I think the current progression we’ve had in the perceptive AI technologies is a good wake up call and opportunity for us all to come together and really talk about what we want out of this and make sure that we’re responsible about its progression.
Annie Chechitelli (30:15):
I love that, and there are so many good aspects that have come out of this advancement in terms of discussions in the community. It’s been a while since there was a lively discussion around academic integrity, right? And it’s been a while since we talked about what good assessment is and what it isn’t. What is the conversation around busy work versus analytical work that’s helping somebody develop the skills to be successful, versus recall? So I really like that we’re having these conversations. I also agree that there are so many opportunities to help students with this technology. I think English as a second language is a great one. I think anyone who has dyslexia is another. There are just a lot of opportunities for people to have tools to better express what’s already in their head, and we want to make sure we’re providing them to students across the board.
I think it’s also really interesting the different pedagogy of having students actually grade and give feedback on ChatGPT generated text to learn about tone, to learn about fact checking, to learn about just digital literacy.
So there’s really good things about just giving students more models that they then learn from to say, this is different types of writing, what do you think of this? Because I think as citizens we need to do a better job of having the skills to analyze information and draw conclusions about it.
So all of those things are great. Now, in terms of the version and what’s to come, I don’t like to predict these things. I just know that Sam Altman said in some of his interviews, I think his words were, “If you’re thinking it’s a revolution, you’re begging to be disappointed.” He used that word, and I’m like, “Begging for disappointment? I’m going to say that a lot. It sounds great.” I think he’s trying to temper the hype a little bit, and so we’ll see where it goes, and we’ll be ready. And we have some inklings that it’s already being used in certain tools, so we’re trying to see where those differences show up, to be a little bit ahead of it, and we’ll continue to do that.
I think that as a writing assistant, it’s such a great tool. Even before ChatGPT came out, I had my daughter sit and use Jasper in front of me, and it was fascinating to watch: she would ask a question, find something interesting, take notes, then ask another question. It’s a faster way to get information, to spark that creative thinking. All of those are positives.
Dr. Cristi Ford (33:23):
You both have hit on a couple of things, and I could talk with you all for another hour and a half, but I want to jump to this area around potential applications, because Annie, you mentioned the places and talked about your daughter’s use, and Linda, you talked about other industries. I do wonder, are there some potential applications of ChatGPT in specific domain areas and different products? How soon will we see those, and what do those processes look like from what you’ve been seeing?
Linda Feng (34:00):
I mean, the thought around this is, I think, where everyone’s gotten so excited about the possibilities. EDUCAUSE had this great panel a little while ago where Tim Renick from Georgia State talked about the chatbot program they had instituted, and how much it helped when you need to supply that kind of supportive assistance to students at all different times of the day. Students might have even felt more comfortable texting a question to this chatbot that they wouldn’t have asked a human. And so much of the ability to support students, and really increase the reach and effectiveness of our existing, overwhelmed university administrators, I think that’s a huge boon.
I think there’s what’s already been brought up around multimodal practice, helping neurodivergent learners and even educators. What if, as an educator, maybe I just need a little help knowing the right tone to message my students about something, and I can get assistance to make sure I’m saying the right thing? And so the idea of a virtual TA being able to offload some of those repetitive tasks that teachers and TAs have to do today, I think those are all good things. But whatever we do, we have to think of all of this as really being assistive; at the moment we cannot remove the human from the equation.
Dr. Cristi Ford (36:00):
Say that it gets blended for the educators. Say it loud.
Linda Feng (36:05):
Yeah. So I think that’s really important. And then I think the other thing is we have to make sure that we think about the ethics and that there is inherent bias in the data that is being used to train these models. And so we need to take that into account.
Annie Chechitelli (36:27):
Yeah, that’s a great point. And one of the things that we’ve done in our model… So there are all these different models. There’s the model for the AI, and then there’s the model for the detection. For the detection model, one of the things that makes it fun and unique is that we use student writing. There’s all kinds of writing, but what are the telltale signs of student writing? We compare that to what comes out of ChatGPT or another tool, looking at that pair of documents and how they’re different. And we are making sure that in our selection of student writing we are representing all different communities, whether that’s lower-income schools, K-12, or HBCUs. We’re making sure that the corpus we’re using to create this is inclusive, and that we’re not just choosing one sector or one segment of the population, thereby introducing a bias that we didn’t intend.
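Annie’s paired-document comparison can be sketched, very loosely, in code. This is a hypothetical toy, not Turnitin’s detector: it uses a single stylometric signal often cited in AI-writing detection, the variance (“burstiness”) of sentence lengths, and both sample documents are invented for illustration.

```python
import statistics

# Hypothetical sketch -- NOT Turnitin's model. One telltale sign often
# cited is "burstiness": human writers tend to mix long and short
# sentences more than generative models do. We score a document by the
# variance of its sentence lengths and, given a pair of documents,
# label the flatter one as more AI-like.

def sentence_lengths(text):
    sentences = [s.strip() for s in text.replace("!", ".").replace("?", ".").split(".")]
    return [len(s.split()) for s in sentences if s]

def burstiness(text):
    lengths = sentence_lengths(text)
    return statistics.pvariance(lengths) if len(lengths) > 1 else 0.0

def more_ai_like(doc_a, doc_b):
    """Of a pair of documents, return the one with the flatter sentence rhythm."""
    return doc_a if burstiness(doc_a) < burstiness(doc_b) else doc_b

student = ("I stayed up all night. The essay fought me at every turn, honestly. "
           "Coffee helped.")
generated = ("The essay discusses several topics. It covers many important ideas. "
             "It ends with a conclusion.")

print(more_ai_like(student, generated) == generated)  # True
```

A real detector would combine many such features, more likely via a trained language-model classifier, over a large corpus of writing, which is exactly why the inclusiveness of that corpus matters.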
Dr. Cristi Ford (37:38):
That’s helpful, Annie, because one of the things I wanted to end talking with you both about is around the equity issue and the access issue. And Linda, earlier you talked about if everyone is promulgating the same fact, even if it’s not true, the system will pick up on that. And so I’m thinking about unfortunately we live in a world where hate speech is commonplace or there are things that are being communicated on the internet that have implicit bias and the system is learning from that as well. And so I’m worried about that and I wonder as educators being on the technical side, what are the things that we need to do to combat that? What are the things that we need to do collectively to really figure out how to remove that implicit bias from what the system has been trained on?
Linda Feng (38:37):
And that’s something that I think has been interesting, because technology is how we’ve arrived here, but we can also think about technology as a way to help us attach responsibility to where we’re going. Think about the idea that a model just picks up on the number of occurrences and says, well, that’s the higher probability that that’s the next word. What if there was a way to tag or upvote content so that some kind of curation step is involved? This could give people better confidence, and you can see some versions of this out there. Again, it’s not perfect, because we are humans and we are not perfect, but on Yelp you can look through and sort and filter by the way in which people have put their answers in.
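Linda’s point about frequency-driven next-word prediction can be made concrete with a toy bigram model (a drastic simplification of how large language models actually work, with an invented training corpus): whichever word most often followed the previous one in the training text wins, whether or not the resulting claim is true.

```python
from collections import Counter, defaultdict

# Toy bigram "language model": predict the next word purely from how
# often it followed the previous word in the training text. Repetition,
# not truth, drives the prediction -- the false claim below appears
# twice and therefore wins.
corpus = (
    "the moon is made of rock . "
    "the moon is made of cheese . "
    "the moon is made of cheese . "
    "the moon orbits the earth ."
).split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(prev_word):
    """Return the most frequent follower of prev_word."""
    return counts[prev_word].most_common(1)[0][0]

print(predict("of"))  # prints "cheese": the twice-repeated falsehood outvotes "rock"
```

A curation or upvoting layer of the kind Linda suggests would amount to reweighting these counts by something other than raw frequency.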
And most people, I think, when you look at a Yelp review, if one person had a really bad experience and talked a lot about it but everybody else had decent experiences, you factor that into your overall assessment. With Wikipedia, we also have some controls around what is allowed in, because we have the Wiki citizens, I think they’re called, who are allowed to pull things in, and they have guiding principles for what is allowed; it has to have the right references and so on. So I think that we as humans can try to leverage the technology to make sure that the input sources get vetted properly. There is, again, a lot of debate about how questionable content on Facebook and Twitter gets moderated, but that is the discussion we need to be having, because we know that there will be that variability, and there is a need for us to converge on things that we can rely on.
So I think in this way this is a really interesting time to live through, and I think we also have opportunities. And I think Annie, you said this at the beginning, we know we may not get things right, but I think there’s a lot of learning and we want to continually learn from what we’ve gone through.
Annie Chechitelli (41:46):
And in addition to the bias, for me, privacy is really important too. We could spend an hour on that, but I just want to make sure we use that word here: putting a whole bunch of student information into a system that you don’t know is something to discuss over time, across different tools and how we use them. So that is top of mind for me personally and for us as a company. And the second is around equity for the tools. I will not be happy, nor will I tolerate it, and will probably stand in the streets, if it gets to a point where these tools are really expensive and only students in a certain socioeconomic demographic improve their writing. That’s not okay, so we’ve got to make sure that we have that conversation around access to some of these tools as they develop. And that’s another thing that we talk about at Turnitin and that I’m personally interested in and have strong opinions on.
Dr. Cristi Ford (43:01):
Annie, I’m going to stand in the street with you on that one, because I completely agree. In talking with our last guest and other institutions, some institutions are talking about purchasing the non-free version for all students on campus. Because if these tools are no longer accessible and free for all, are we creating another digital divide for students across institutions, where some have access and others don’t? So just let me know what day and on which street and I’ll be there. I really appreciate that. But listen, both of you, Annie and Linda, have offered some really great contextual points in this conversation. As we leave today, I just wonder, is there a call to action that you would give to the listeners of this episode? Either a call to action, a word of encouragement, or advice as a parting word for this podcast episode?
Linda Feng (44:06):
I mean, I would say that it is important for us to be measured in how we are using this, so not to get caught into the hype cycle. There are efforts out there related to privacy and equity that are important for us all to participate in. So I know for example, that there’s an organization called EdSAFE AI, so it’s edsafeai.org and they are working to try to create good benchmark data around data sets and promote the ethical and responsible use of AI in education. So I think efforts like that would be really important for people here to get involved in.
Dr. Cristi Ford (44:59):
Really great resource, Linda. Annie, thoughts from you?
Annie Chechitelli (45:02):
Yeah, same as Linda: just turn down the heat and the hype, and then we’ll have a rational conversation. I guess I’d ask that everyone have a conversation about it and start to learn what the questions are, and which are the most important questions to answer, knowing that we’re not going to know all the questions yet, so how do we keep that open? I also ask, on behalf of a company: give feedback. I’m sure the same goes for D2L-ers. The more you tell us, “This worries me,” or, “Here’s a new problem,” the better. For instance, we have a lot of teachers now saying, “Hey, this was never a problem before, but these references are made up.” Oh, that’s a problem we might be able to help with.
There are going to be new problems, as well as things that we do that you don’t like, and we need to know that so we can change and continue to develop. So there’s the ask for that involvement, and patience at the same time, as we’re all learning through this.
Dr. Cristi Ford (46:03):
Really, really great advice, constructive participation, being a part of the conversation. Annie and Linda, thank you for having this conversation with us today. We really appreciate your time.
Linda Feng (46:14):
Thank you.
Annie Chechitelli (46:14):
Thanks. It’s great to be here.
Dr. Cristi Ford (46:18):
You’ve been listening to Teach & Learn: a podcast for curious educators. This episode was produced by D2L, a global learning innovation company, helping organizations reshape the future of education and work. To learn more about our solutions for both K-20 and corporate institutions, please visit www.d2l.com. You can also find us on LinkedIn, Twitter, and Instagram. And remember to hit that subscribe button so you can stay up to date with all new episodes. Thanks for joining us, and until next time, school’s out.
Resources Discussed in the Episode
- Learn more about EdSAFE AI Alliance (Global leadership for AI in Education)
- Read “Understanding false positives within our AI writing detection capabilities” – from the Turnitin blog
- Read The Markup’s piece on decoding the hype about AI
- Read “Discrimination in a Sea of Data: Exploring the Ethical Implications of Student Success Analytics”
- Learn more about Turnitin’s Academic Integrity products
- Explore Turnitin’s AI writing and tools resource and information page
- Learn more about Dr. Cristi Ford and the Teaching and Learning Studio
Speakers
Annie Chechitelli
Chief Product Officer, Turnitin
Annie Chechitelli is the Chief Product Officer at Turnitin. As CPO, she oversees the Turnitin suite of applications, which includes academic integrity, grading and feedback, and assessment capabilities.
Prior to joining Turnitin, Annie spent over five years at Amazon where she led Kindle Content for School, Work, and Government and launched the AWS EdTech Growth Advisory team, advising education technology companies on how to grow their product and go-to-market strategies with AWS.
Annie began her career in EdTech at Wimba where she launched a live collaboration platform for education which was ultimately acquired by Blackboard in 2010. At Blackboard she led platform management, focused on transitioning Blackboard Learn to the cloud.
Annie holds a B.S. from Columbia University and an M.B.A. and M.S. from Claremont Graduate University. She lives in Seattle, Washington with her husband and three children and is an avid tennis player.
Linda Feng
Vice President of Architecture, D2L
Linda Feng is the Vice President of Architecture at D2L. Linda comes to D2L from Unicon, where she focused on providing strategic consulting to leading institutions, major publishers, and edtech companies on integrations and learning analytics. Most recently, she helped several education institutions design and build analytics pipelines using AWS, as well as Google Cloud Platform with Canvas Data and Canvas Live Events.
Linda was formerly a Senior Product Manager with Instructure and has over 20 years of experience in database server and applications product development at Oracle. She has also authored articles and spoken at numerous education technology conferences, such as EDUCAUSE, ELI (the EDUCAUSE Learning Initiative), IMS Global’s Learning Impact, and Open Apereo.
Dr. Cristi Ford
Vice President of Academic Affairs, D2L
Dr. Cristi Ford serves as the Vice President of Academic Affairs at D2L. She brings more than 20 years of cumulative experience in higher education, secondary education, project management, program evaluation, training, and student services to her role. Dr. Ford holds a PhD in Educational Leadership from the University of Missouri-Columbia and undergraduate and graduate degrees in psychology from Hampton University and the University of Baltimore, respectively.