
Why Younger Workers Teaching Generative AI Can Be Risky



Research shows that younger workers are often key to the adoption of new technologies in workplaces, sharing information and informally training their older colleagues. Now a new working paper published this month by a group of researchers at top institutions suggests using caution with this approach when it comes to generative artificial intelligence.

The paper, titled “Don't Expect Juniors to Teach Senior Professionals to Use Generative AI,” is a follow-up to a notable study released last September about an experiment with Boston Consulting Group staff. (That September paper introduced the idea of a “jagged frontier” between the tasks genAI handles well and those it does not, making it hard for people to know intuitively whether it is suited to any given task at hand.)

The new paper contains some other important, thought-provoking observations about how managers and leaders should approach rolling out genAI in their organizations. We spoke last week with Kate Kellogg, one of the study's authors and a professor at the MIT Sloan School of Management, and Hila Lifshitz-Assaf, a co-author of the study and a professor at Warwick Business School. Here's a transcript of our conversation, edited for clarity:

This research working paper is related to the study that was published in September about BCG consultants?

Kellogg: Exactly. To give you a bit of context there, for everybody who used generative AI in the experiment, we did an hour-long interview afterwards, and this paper is based on some of those background interviews. We found this interesting result: the consultants were concerned, 'I'm going to want to use this, but I don't know how my manager is going to react.' So for consultants in the problem-solving group, once we learned that, we started to zero in on that question, and that's what this paper is based on.

So this emerged out of those conversations because of that question and that concern. What do you think are the most important takeaways from this working paper for organizational leaders?

Kellogg: The most important thing is that organizational leaders are always tempted to allow their junior professionals to serve this informal role of teaching new technologies to their more senior colleagues. The reasons for that, typically, and this is in the academic literature, are that junior professionals are closest to the work itself because they don't have leadership responsibilities; that they're not embarrassed about being seen as novices, because they're not expected to be experts already; and, thirdly, that they're not so steeped in traditional technology, so they're just more willing to experiment with new technologies. In our work with other organizations, what we're seeing is that this seems to be particularly true with generative AI.

What we said in our other paper is that there is a 'jagged technological frontier' that's rapidly changing. So it's harder to develop formal trainings. And organizational leaders are really counting on junior professionals who are going to LinkedIn and Twitter and learning and experimenting.

Our paper is meant to be a cautionary tale where we say, 'We know you're going to be tempted to do this, but we want to highlight what we see as the barriers to doing it.' The main headline in my mind is you're going to think this is an easy thing to do given the rapid change in the jagged frontier, but you need to understand what the obstacles are to that being your only strategy.

Why can't organizations just rely on junior employees to provide information and informal training for more senior employees around generative AI?

Kellogg: What we found is that junior professionals tend to fall into three novice traps. This is what we learned from the interviews. The first one is that even though they are typically doing more experimentation than the more senior professionals, they still do not have a deep understanding of generative AI. They often are not fully informed about the accuracy and explainability limitations of generative AI, and they also tend to underestimate how good generative AI is at contextualization, or providing relevant outputs. So the first barrier we find is that they don't have as deep an understanding as the people who are really technical experts in generative AI do have.

So that's the first one: They're not fully informed of, and don't have as deep an understanding of, the accuracy and explainability limitations of generative AI. The second one is, and this makes sense, they work at the project level with their managers. So they tend to think of solutions at that level. As they're experimenting with generative AI, they run into roadblocks. They tend to think of solving those challenges at the project level and around changing human routines within the team. So the second one is they tend to think in terms of human routines rather than system-design changes to address the risks presented by generative AI. The third one is they tend to think of project-level interventions instead of firm-level interventions and what we would call ecosystem-level interventions, which means going back and forth with the developers to have certain requirements around the generative AI solutions. The three things are a lack of deep understanding, a tendency to think in terms of changing human routines, and a tendency to think at the project level.

Your research also found that the more junior employees felt an obstacle to their ability to coach seniors would be 'the novel risks that the technology poses to outcomes that seniors value.' How significant is that?

Kellogg: What we found was that the juniors saw these risks as less of a major roadblock than they believed their managers would. They thought their managers were going to be concerned about accuracy, explainability, relevant outputs, and automation complacency. They saw those as issues, but they just felt that those issues were more addressable, and they worried that their managers would be less open to addressing them.

Lifshitz-Assaf: It's really interesting to think about the difference in accountability, too. Of course, they would not think about those things in the same way, because they're not accountable for them in the same way. And rightfully so; that's okay given the way we have designed organizations so far. But now we have this powerful tool that gives you the ability to do things, and it's not always transparent, it's not explainable. It's very tempting to use it when you don't have the accountability. So everything that we saw is very natural, and from their perspective, they are pushing the frontiers of the technology. They're doing their best to learn how to use it. I'm saying this just to stress that I don't want to say that we're critical of the way that they're using it.

We think they're doing their best, but because of how the system has been built, and how organizations and professionals have been built for years, it doesn't fit. This new technology doesn't give managers the ability to fully understand it, to supervise it, to see in a transparent way what exactly has been happening. So there is this interesting gap with this technology that did not exist with prior technologies. The junior professionals can use it, and can use it fast and well, but it's not explainable, they do not perceive the risks in the same way that their managers do, and the managers face some opacity. They don't have the usual means of micromanaging, the transparency that they're used to, the full understanding of what their junior people actually do to make sure that it's working well. It's an interesting challenge that this technology poses.

Kellogg: Hila mentioned that there's a difference because of their position. The reality is they have a different level of accountability than the managers do. One other thing we think is going on is that there's also a generational difference in how they view the potential for this technology and that the managers coming from a different generation seem to be more conservative than the junior professionals do. So we think it's both of those things.

Lifshitz-Assaf: We saw enthusiasm to a whole new level. We asked them also about emotions, and we have data on that, and it's amazing to see the enthusiasm that they have. So that's why I wanted to shine also a positive light on the juniors. They're enthusiastically adopting a new technology and really trying to do their best by using it. It's just that as they're doing it, they're also taking some risks that could be problematic for their organization. In a good and naive way, they're thinking about it: it's according to their roles, it's on the project level, it's on the level of human routines. That's what they've been trained to do. That's what they know. Now the question is what do we do with the managers who don't understand the technology as well, but have got to think about it and find solutions on the system level? That's why it's a fascinating tension that we found.

What are your recommendations to organizations in light of this?

Kellogg: Why don't we take each of those three things in turn? The first one is that the novices are lacking a deep understanding of generative AI. The areas where the juniors seemed to lack a deep understanding were the accuracy issues and the explainability issues. Remember, to be fair, they had been experimenting with generative AI, but not in the work setting; they're just really learning this on their own. What we found is that novices, right out of the box, when they're learning it on their own, tend to overestimate the accuracy of generative AI. What we recommend there, and experts say this, is that one important thing for an organization to do is to figure out what the important sub-tasks are that generative AI is going to be used for in your organization, and actually test the reliability of generative AI on each of those sub-tasks.

Lifshitz-Assaf: For instance, for the problem-solving task, the task was to recommend to a CEO a strategic decision on whether to invest in a fashion company's men's, women's, or kids' brand. It's a very clear analytical task with subtasks in it. The subtask thing is really important because each of those subtasks has different levels of reliability and accuracy with genAI. They had to analyze qualitative data, interviews from that company, news articles, etc. In the real world, it would be that you have some text to analyze on the company, then you have data, financial data on those brands of the company. Then you have to make up your mind and make a recommendation, to do the analysis and to write it in a convincing way.

Each of these subtasks has different degrees of accuracy and different quality. Let's start with the end, the writing in a convincing way. We found that generative AI really helped increase the quality of writing. The problem was that it did so regardless of whether you were right or wrong. Even the consultants who were wrong seemed to be better. So when CEOs, or when their managers, imagine getting their recommendations from their junior people, they could actually think they're doing better, that this is great quality, a really persuasive recommendation, even though it's wrong. The issue is that when those consultants were working with generative AI, it wasn't easy to see whether they were right or wrong, whether they were fitting it to the data. Today, genAI is still not accurate at analyzing qualitative data, which surprised us ourselves. We thought that since it's a text-based tool, it should be best at analyzing those interviews. But actually, when it summarized them, it did not know, in the same way that we as humans know, how to differentiate and discern what is more important and what is not, what is background and what is foreground. It missed a lot of those nuances and a lot of important data in the interviews. With the financial data, it still depends on the complexity. We have work showing that there are issues with the accuracy there, too. Many consultants tried to do it also in a spreadsheet and double-check.

So, overall, you would map the workflow and you would say, 'How do we make sure that we don't do the qualitative analysis and the quantitative analysis just with genAI, or that we double-check it, or that we build the process differently?' The writing, the recommendation, and even before that, the framing and the context of the question, it's great to use it for those. Asking it for examples, like benchmarks from other industries, that's great too. But once it gets to the real analysis, then you have to be careful.

Kellogg: That's on the accuracy. Then on explainability, what we found is that genAI was such a power persuader that even when these consultants tried to push back on generative AI, the generative AI would come back and say, 'I'm so sorry.' The consultant would say, 'Wait a minute, is that what they really said in those internal interviews? Did they really say that the women's brand was better?' And generative AI would come back and say, 'I'm so sorry. You're right. That person didn't say that. But still, it is better for these five reasons.'

Lifshitz-Assaf: It's amazing. It's designed like a power persuader, like a salesperson. It's designed to convince you. Really we were shocked. We see all this back-and-forth conversation. We're going to write more about it, by the way, later on, because we do think that that persuasion power is unique. Technology did not have that before. Predictive AI doesn't persuade you, it gives you an output, you choose what to do with it, but it doesn't keep on nagging you and persuading you when you criticize. There's not this back and forth, which is crazy.

Kellogg: We found that the junior consultants were persuaded by that. Yet experts know that, at least at the time of our study, the explainability was limited enough that you needed to just really not use the tool, except in cases where a high degree of explainability was not required. Instead we saw the juniors saying you need to explain the model logic to the managers, you need to agree within the team on what's the right process for an explainable output. What the experts would say is it's an illusion of transparency; it's not explainable, so you just need to not use it if you need explainability. So that was the second one. Then the third place where we saw juniors didn't have a deep understanding was that they tended to think they could only use it in cases where contextualization was not necessary, when in fact experts even at the time knew that you could use techniques like RAG (retrieval-augmented generation) to add context. This was actually really a strength of generative AI. What we're saying with the first one is that there are experts on generative AI, and you need to get them involved in order to address some of these issues, to figure out what the best use cases are within your own organization, given the common problems that your people are solving. So that's number one.
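For readers unfamiliar with the technique Kellogg mentions, here is a minimal sketch of the retrieval-augmented generation (RAG) pattern: retrieve the most relevant internal documents and paste them into the prompt so the model answers in the organization's context. The toy keyword retriever, the sample documents, and the placeholder call_llm function are illustrative assumptions, not part of the study or any particular vendor's API.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Assumptions: a toy keyword-overlap retriever stands in for a real embedding
# search, and `call_llm` is a placeholder for whatever generative AI endpoint
# an organization actually uses.

from dataclasses import dataclass

@dataclass
class Document:
    title: str
    text: str

def score(query: str, doc: Document) -> int:
    """Toy relevance score: count of query words that appear in the document."""
    words = set(query.lower().split())
    return sum(1 for w in words if w in doc.text.lower())

def retrieve(query: str, docs: list[Document], k: int = 2) -> list[Document]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[Document]) -> str:
    """Ground the model by pasting retrieved company context into the prompt."""
    context = "\n\n".join(f"[{d.title}]\n{d.text}" for d in docs)
    return (
        "Answer using only the context below. If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

def call_llm(prompt: str) -> str:
    # Placeholder: in practice this would call the organization's approved model.
    return f"(model response to a {len(prompt)}-character grounded prompt)"

if __name__ == "__main__":
    corpus = [
        Document("Interview notes", "The women's brand grew share in two regions last year."),
        Document("Financials", "Kids' brand margins declined for three consecutive quarters."),
    ]
    question = "Which brand shows declining margins?"
    grounded = build_prompt(question, retrieve(question, corpus))
    print(call_llm(grounded))
```

In practice the retriever would be an embedding search over the firm's own documents, which is how contextualization can become a strength of the tool rather than a limitation.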

The second thing is that the juniors tended to recommend intervening at the level of the project. For example, for accuracy issues, they said that if managers are worried about accuracy, one thing we could do is have them review our prompts and responses. Experts would say that's ridiculous. It's just going to balloon to such a level that it's unrealistic to think your managers are going to be reviewing all of these prompts and responses. That's just not going to happen. For accuracy, instead, experts would say: set up a monitoring system to check whether the responses are in line with users' goals, use a model that provides links to sources, use a more accurate model. They would say you really can't expect humans to be the double check on accuracy. That's just not realistic.

Then another thing the juniors tended to recommend was in terms of automation complacency. They said, 'We think our managers are going to be concerned about this. What we really need to do there is train users to take ownership of their work. We're professionals, and it's just really important that these juniors right out of the box know that they're the ones who really need to take accountability.' What the experts would say is that's purely a human intervention. It's not wrong to train professionals to take ownership, but in fact there are a lot of other things you can do at the system level. For example, you can change the interface to visualize uncertainty; that has been shown to make people less complacent. You can design a system that provides self-reflective prompts. There are a lot of things you can do on the interface itself that really help address this automation complacency and that are going to be more powerful. Of course you want to tell people to take responsibility, but that's not going to be the best way to solve that problem.
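As one hedged illustration of what those interface-level interventions could look like, the sketch below wraps a model answer with a visible confidence banner and a couple of self-reflective questions before it reaches the user. The confidence score, thresholds, and wording are assumptions made up for illustration, not measures or recommendations taken from the paper.

```python
# Sketch: surface uncertainty and a self-reflection checkpoint in the interface,
# instead of relying on users to remember to double-check AI output.
# The confidence value and thresholds here are illustrative assumptions.

def render_with_uncertainty(answer: str, confidence: float) -> str:
    """Attach a visible uncertainty banner and reflection prompts to a model answer."""
    if confidence >= 0.8:
        banner = "Model confidence: HIGH (still verify key figures)"
    elif confidence >= 0.5:
        banner = "Model confidence: MEDIUM (verify against source documents)"
    else:
        banner = "Model confidence: LOW (treat as a starting point only)"

    reflection = [
        "Which claims in this answer can you trace to a source you have read?",
        "What would change your recommendation if this figure were wrong?",
    ]
    prompts = "\n".join(f"- {q}" for q in reflection)
    return f"{banner}\n\n{answer}\n\nBefore you rely on this:\n{prompts}"

if __name__ == "__main__":
    print(render_with_uncertainty("Invest in the women's brand.", confidence=0.55))
```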

The third one is project level versus system level. What we saw the juniors say is, for example, for accuracy, they wanted to do a lot of gaining agreement within the team: let's get agreement within the team, where the managers and the users agree on the conditions under which generative AI can be used reliably and the managers review our work process. Again, they're thinking that way because the reality of their job is that they really do need to get their managers' buy-in on what the use cases are. So it's not wrong. Yet it overlooks the fact that this is a really complex system, and there's a lot that can be done at the firm level and the ecosystem level.

For example, Hila was saying before about the power-persuader nature of generative AI. That's a perfect example where developers are anthropomorphizing the AI and making it come across as a power persuader, but it doesn't need to be designed that way. Right now, firms aren't really saying, 'Here are our requirements. If we're going to use generative AI in our set of firms, we need you to do certain things. First of all, we're worried about your data sources. We need you to tell us how credible the data sources are and tell us that you're providing real-time updates on data sources. We want you to flag and correct misleading responses.' What we've seen so far is that firms have been in the passenger seat, just taking on these generative AI products. But now, as we see more competition in the market, we think there's going to be a way for firms to intervene with developers around some of these issues, like accuracy and complacency.

One thing they can do is intervene with developers. And then, even at the level of the firm, there are a lot of things firms can do, like provide a prompt library to support better contextualization. They can provide audit tools to allow users to audit the responses. Just assuming that these junior professionals are going to be the ones who come up with solutions is almost like taking away the responsibility that the leaders themselves have for providing the right infrastructure in their company and for interfacing with developers around the design of the tools.

Some of the solutions that you're suggesting are about educating younger workers and people generally about the technology. Should we draw the conclusion that younger workers should not be relied on to provide informal training for more senior workers, or is it that younger workers should be better trained themselves on the limitations and strengths?

Kellogg: Given the rapid change in the jagged frontier, there has to be rapid experimentation, and we should have people learning from juniors. They're the ones close to the work; they're doing the experimentation. What we're really saying with this paper is this: the recommendations we're seeing out there for what leaders should do are to tap into these informal channels, do peer-to-peer training, let people learn from their peers. And we're saying, absolutely, yes, you have to do that, but don't give up the responsibility you have yourself for making changes at the system level, and for giving people some knowledge about these novice traps that we're going to see people fall into, which we think are predictable and could be addressed by knowing what the common traps are. We're saying avoid the common traps with training, and make sure you do things at the firm level and with the system developers. Don't just rely completely on the junior professionals, even though this is so uncertain and so rapidly changing that that may be your immediate reaction.

Should organizations more aggressively provide formal training in generative AI that doesn't have some of the blind spots that you've been talking about?

Lifshitz-Assaf: Actually, it's really problematic to expect proper training right now in the way that we have for other technologies. It's very early, and it's changing rapidly, and it's jagged. Of course we should have proper training, and we should co-develop it, but we cannot sit back. With our paper, what we want to do is not to say, 'Don't trust them; just go to Accenture or McKinsey and ask for their training.' The idea is to try to wake up to the new responsibility that you have as a manager. Your junior people will not be enough. The experimentation that Kate mentioned is key, because it's not something you can approach with an optimization mindset of 'I will know for sure for the next year which sub-tasks I should do with AI or not.' No, it's going to change. So you're not going to invest in the old-school type of training, where we train for three months and then we'll have a certificate and in a year all of my workforce will know it. No, it has to be much more agile, and it has to be experimental. That's a deeper change that needs to take place. And by experimental, we don't mean just trial and error. It has to be structured experimentation. Think about what we did with BCG. They're actually now teaching other organizations to do it, and many organizations are trying to systematically experiment with their tasks: to take strategic tasks, to experiment in a systematic way, and to have the learnings from the different units that do the experimentation flow into a central unit that learns and transfers that knowledge.

Kellogg: We do like the idea of experimenting with teams. You could envision this structured experimentation where you have a lot of teams, for example in a consulting firm, using generative AI in different ways, but then you need to have a central group that's tracking all these experiments to see which ones are most effective and, for the ones that are effective, scaling those throughout the organization. In addition, what we're showing with our paper is that we're not just talking about experimenting with the way people interact with one another on teams. We're talking about experimenting also with the data, the model, the systems themselves, and really trying to integrate experimentation on what we would call the material aspects of things with the human aspects.

The one other thing that we think is really important with experimentation is that that central team be staffed by every position that's involved in generating output from the generative AI. One thing we found in other settings is that especially people who are, for example, in an administrative position might have something very important to say, but they're not going to say it unless they are actually given a seat on the central team to say, 'You know what? You may think you want to scale that, but here's what's going to happen to everybody in the department in our position if you do it that way.' So we need to have the structured experiments, and we need to have the central team be staffed by every position that's involved in using the technology for a solution.

Your research identified a perception by more junior workers that the senior staff might feel that generative AI was in conflict with the outcomes that they value. Beyond the context of coaching by junior employees, are there implications from that for the pace and extent of adoption of AI tools within organizations? Essentially, are the junior workers flagging a point of resistance to the use of AI tools more broadly? And is this a signal that progress might be slower than some people initially would expect?

Kellogg: Yes, we really think so. We think that a lot of the writing on AI is done in experimental settings that are not field experiments like ours. A lot of it is all about whether the individual makes a decision to use the AI or not. You see all kinds of things around what are personality differences, what are differences in the system that would lead me as an individual to choose to use AI in my work. What we're showing with the field experiment is this is a social system. This is not an individual in a vacuum making a decision. This is an individual within a team and an organization. Then the decision is very much informed by this whole group of people. So absolutely.

The way to facilitate that is, in the same way we're saying juniors should be shown, 'Here's where you might be pushing things too fast and be ill-informed in these particular areas,' we're also saying with our other experiment, 'On the other hand, look at these crazy, ridiculously wonderful productivity gains you can get when you use AI for the tasks to which it's well suited.' We need to show the junior professionals and the organizational leaders, 'Here are the limitations.' And we need to show the manager level, 'You'd better move really fast, because here are the incredible, amazing productivity gains you could get that you want to make sure you don't miss out on.'

That's nuanced. I can imagine it's challenging as a leader of an organization to simultaneously deliver those two messages.

Kellogg: Yes, that's why we call it the jagged frontier. That is a really difficult thing for professionals to know. If you think about this frontier, inside the frontier are the tasks AI is really good at, and outside the frontier are the ones where AI is going to stumble, and it's jagged. It's very difficult for an individual professional to know where you are. That's why we're saying there have to be these systematic experiments where you're testing where you are with all of these key tasks and key workflows that you're doing on a regular basis, given your industry and your customer base and your company.

Charter has researched engagement with, and anxiety about, the use of AI at work and has found gaps by gender, race and ethnicity, and age. Is that something you looked at in your research?

Kellogg: We've got follow-up work to look at what the variation in this is right now. This paper is really reporting the main effect, which was that it was really striking and surprising to us that these junior professionals were not going to be the clear answer here in terms of training. Now our next step is to look at the variation and to see: Does this vary by type of professional, or by those kinds of demographics that you just mentioned? Also, on the manager side, does their true willingness to embrace it vary, and what do the professionals think about that? That's a future study for us.

Article by Kevin Delaney, CEO and editor in chief of Charter. Copyright 2024, Charter. This article is reprinted with permission from Charter. All rights reserved.
