Have you ever wished you had a duplicate version of yourself? If the answer is yes, this episode with Jon Tota is for you!
Jon has spent his 25-year career at the forefront of all aspects of online digital media, including what’s in this episode! You’ll learn about photo-realistic digital replicas of people using A.I. These are also known as digital doubles or twins, synthetic avatars, and people’s digital likeness.
Jon started as a screenwriter in the early ’90s and then went on to design computer networks and trading systems for Wall Street firms. Jon co-founded Edulence in 2002 and launched Knowledgelink as one of the first video training platforms in 2004. Edulence was acquired in 2020 by ELB Learning to make Knowledgelink the Learning Management System for one of the most trusted brands in the Learning & Development space.
Fast forward to today, Jon creates immersive audio and learning content at Syntax + Motion which produces online courses, interactive video series, and podcast shows.
He’s also at the forefront of A.I. synthetic media as CEO of Render. Render provides Digital Likeness solutions for personal brand businesses, thought leaders, influencers, and business leaders who want to leverage their synthetic, A.I.-powered Likeness to communicate more effectively online and create audio and video content from anywhere without going on camera.
As a sponsor of this podcast, Render is going to be creating a digital replica of Helen’s image and likeness, and an accurate clone of her voice so she’ll be able to create hyper-realistic avatar content with ease later this summer and we couldn’t be more excited about it!
We’ll also be bringing a content capture event to Cincinnati in October so sign up for our free newsletter to stay in the loop!
In this episode’s conversation with Jon, you’ll discover what digital doubles are, their application use cases in our 2D world, the ethics around them, the power of prompting from a developer’s perspective, how ChatGPT can help writers, how A.I. can break down barriers of entry for people to pursue new creative outlets, and why Jon always starts with a script.
Jon has worked in the synthetic media space for the last three years and realizes it may be a new concept for many. Sometimes the avatars that he creates are known as digital twins or digital humans.
“Simply put, it’s essentially a digitally generated version of you.” – Jon Tota
It’s a digital replica of yourself that looks very lifelike and mimics your voice. Once you have your avatar, you can type in scripts and then it delivers your message through video. It can be unsettling to see your digital twin for the first time since it’s so realistic!
Jon appreciates the fact that synthetic media and A.I. tools help people create content more quickly and cost-effectively. However, he doesn’t want there to be an inauthentic relationship with the technology and the audience. We should present avatars as tools for communication rather than trying to pass them off as real people.
He wants to help people figure out how to use their avatars, introduce them, and name them in an ethical, authentic manner.
Jon told us that there are two worlds in the synthetic media space: the stock avatar space and the digital likeness space. The stock avatar space is where technology is most widely used for large corporate organizations and is a great tool for creating multiple versions of videos at low cost. However, there are challenges with fake news in this space. He gave the example of the deepfake using Tom Cruise.
In contrast, the digital likeness space, where his company Render operates, involves helping people replicate themselves with their permission and use their digital likeness in place of themselves. This is different from deepfakes or fake news, which can be done with stock avatars. Render verifies that the user creating the digital likeness is truly who they claim to be, and the application validates the user’s identity based on their profile content. A digital likeness allows for more creativity and the ability to create more content. The capture process takes about 90 minutes, including recording video and cloning the voice. Jon expects the “digital” you will soon sound almost indistinguishable from the “real” you, since the technology has become less expensive. His clients are mainly small business owners and people who want to expand their personal brand, like a professional speaker trying to book an event or a financial advisor. People have fun creating their avatars, and it’s a cool process.
Avatars require one thing to perform: a script. ChatGPT can be instrumental in helping write the script for the avatar to say. As a writer himself, Jon sees it as essentially having a virtual writing partner. Scripting is so important to the process of using an avatar, and now you’re not scripting alone; you’re scripting with a virtual writing partner that can help you through it and avoid the dread of the blank page. The results you can get back from it are nearly limitless. Furthermore, Jon believes that ChatGPT-based prompting can level the playing field for smaller development teams, allowing them to build applications that would have taken years with larger teams.
The prompts are the most important thing when it comes to ChatGPT and people must learn how to prompt effectively. He thinks about prompts from an application developer perspective.
Jon explains that you can take a three-minute script, run it through a preset prompt in ChatGPT, and end up with a 30-second clip for social media. It would take a writer hours to make those edits. For example, he writes a script from his own perspective that’s serious, and through ChatGPT he can shorten it, make it funny, and tailor it to a certain audience. But people need to know how to give the right prompts. Jon emphasizes the responsibility of application developers to understand their users’ needs and integrate the right prompts to guide them effectively.
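As an illustration of what a developer-side “preset prompt” might look like, here is a minimal Python sketch. The function name, parameters, and prompt wording are hypothetical, not Render’s actual code; the idea is that the application, not the end user, does the prompt engineering.

```python
# Illustrative sketch only: a developer-defined "preset prompt" that turns a
# long avatar script into a short, retargeted social clip. The function name,
# parameters, and prompt wording are hypothetical.

def build_repurpose_prompt(script: str, target_seconds: int, tone: str, audience: str) -> str:
    """Wrap a user's script in a preset instruction for a chat model."""
    return (
        f"Rewrite the script below as a roughly {target_seconds}-second "
        f"spoken clip (about {target_seconds * 2} words). "
        f"Use a {tone} tone and tailor it for {audience}. "
        "Keep the speaker's first-person voice.\n\n"
        f"SCRIPT:\n{script}"
    )

prompt = build_repurpose_prompt(
    script="Hi, I'm Jon. Today I want to walk you through...",
    target_seconds=30,
    tone="funny",
    audience="small business owners on TikTok",
)
# The assembled prompt would then be sent to a chat model (e.g. via the
# OpenAI API), and the response fed to the avatar as its new script.
print(prompt)
```

The user only ever supplies the script and picks a length, tone, and audience; the baked-in instructions are what Jon means by developers integrating “the right prompts” so people don’t have to prompt well themselves.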
Jon is after efficiencies in his work and believes that’s what people should lean on A.I. for. You could use A.I. to essentially plagiarize your way to a first draft, but that’s the wrong use. It should enhance your work, not completely replace it. While Jon values A.I., he does understand the concern people have.
Intellectual property really becomes compromised in almost every way. But then again, we as a society have put all of our thoughts and work out on the internet for public use. You can’t blame A.I. for aggregating stuff that you put online and teaching itself to be you. The problem is particularly challenging for well-known thought leaders who have a lot of content online, as their identities can be replicated by A.I. It’s a side effect of putting great content online. It’s problematic when bad actors try to pass it off as from the original source.
There may be a technical fix to protect content online but it’s difficult to stop the continued development of A.I. models and the growth of content on the internet. He mentions the ongoing debate around slowing A.I. development but the train has already left the station.
Jon believes that A.I. will democratize application development and make it accessible to more people, just as cloud computing did with Amazon Web Services. He envisions a future where people can sketch out a wireframe of a website and use A.I. to code it, allowing for more efficient and faster development.
For production companies with slim margins, A.I. gives them a competitive advantage. It may not be sexy to talk about, but A.I. will have a huge impact on video production (and not just a negative one) because it offers a real chance to make money. And that goes beyond media production.
Anybody with a good idea can build a software application now delivered online and charge people to use it. You don’t need to be an engineer. Jon considers it “one of the coolest things ever.” Small businesses in general can use A.I. to improve efficiencies, scale their work, and profit from it.
While there will be new hurdles, there are also new paths for people. A.I. removes a barrier of entry to certain creative pursuits.
Synthetic avatars will continue to get more and more lifelike. At some point, he sees avatars serving as “extras” in a film completely generated by A.I. There’s only so much you can do with an avatar alone, but you can dynamically generate visuals and music around it, making the video more interesting. Jon also highlights the accessibility of A.I. services through open APIs, which allows for innovation and easy integration into user experiences.
He doesn’t dare predict beyond the next year because the space is moving too quickly. But he sees an innovative space that can open doors. People don’t need to figure out what Stability AI has already figured out; they just need to decide how it integrates into their user experience and how they can add value to the process. A.I. can do things that would have taken years and cut the timeline.
At the end of the day, our guesses are as good as his on what the future holds! However, he does believe it’s an exciting one.
Jon, we appreciate you being our guest on Creativity Squared.
This show is produced and made possible by the team at PLAY Audio Agency: https://playaudioagency.com.
Creativity Squared is brought to you by Sociality Squared, a social media agency who understands the magic of bringing people together around what they value and love: http://socialitysquared.com.
Because it’s important to support artists, 10% of all revenue Creativity Squared generates will go to ArtsWave, a nationally recognized non-profit that supports over 100 arts organizations.
Join Creativity Squared’s free weekly newsletter and become a premium supporter here.
Jon Tota: And this is like a public service announcement. We can teach your audience how to spot an avatar in the wild.
When people think of avatars, they often don’t think of digital clones. They think of something a little cartoonish or video game like, but we now have the technology to produce hyper-realistic avatars of your digital likeness.
Much like me, I’m Jon’s digital clone, and it’s great to get to share about this form of synthetic media with all of you.
Helen Todd: Have you ever wished you had a duplicate version of yourself? If the answer is yes, this episode with Jon Tota is for you. That was Jon’s digital double that just made an appearance, and you can see how it looks on the episode’s dedicated website page at creativitysquared.com.
Jon has spent his 25-year career at the forefront of all aspects of online digital media, including what we’ll be discussing on today’s episode, which are photorealistic digital replicas of people. These are also known as digital doubles or twins, synthetic avatars, and people’s digital likeness.
Jon started as a screenwriter in the early nineties and then went on to design computer networks and trading systems for Wall Street firms. Jon co-founded Edulence in 2002 and launched Knowledgelink as one of the first video training platforms in 2004.
Edulence was acquired in 2020 by ELB Learning to make Knowledgelink the learning management system for one of the most trusted brands in the learning and development space. Fast forward to today, Jon creates immersive audio and learning content at Syntax + Motion, which produces online courses, interactive video series, and podcast shows.
He’s also at the forefront of synthetic media as CEO of Render. Render provides digital likeness solutions for personal brand businesses, thought leaders, influencers, and business leaders who want to leverage their synthetic AI-powered likeness to communicate more effectively online and create audio and video content from anywhere without going on camera.
As a sponsor of this podcast, Render is going to be creating a digital replica of my image and likeness and an accurate clone of my voice. So I’ll be able to create hyper-realistic avatar content with ease later this summer, and I couldn’t be more excited about it.
We’ll also be bringing a content capture event to Cincinnati in October, so sign up for our free newsletter at creativitysquared.com to not miss out on any of the fun.
In today’s conversation with Jon, you’ll discover what digital doubles are, their application use cases in our 2D world, the ethics around them, the power of prompting from a developer’s perspective, how ChatGPT can help writers, how AI can break down barriers of entry for people to pursue new creative outlets, and why Jon always starts with a script.
Theme: But have you ever thought, what if this is all just a dream?
Helen Todd: Welcome to Creativity Squared. Discover how creatives are collaborating with artificial intelligence in your inbox on YouTube and on your preferred podcast platform. Hi, I’m Helen Todd, your host, and I’m so excited to have you join the weekly conversations I’m having with amazing pioneers in this space.
The intention of these conversations is to ignite our collective imagination at the intersection of AI and creativity to envision a world where artists thrive.
Jon, welcome to Creativity Squared.
Jon Tota: Helen, thank you for having me. It’s great to be here.
Helen Todd: We had such a wonderful and amazing conversation the first time that we connected, and I’m so excited to have you here today to share all the amazing things that you’re up to in the AI and synthetic media space.
A good friend of ours connected us and as soon as she heard about my project, she was like, oh, I know exactly who I need to put you in touch with, and I’m so glad that she did.
Jon Tota: Yeah, likewise. I love what you’re doing with the show. I think it’s, it’s an important side of the AI discussion and I think it’s really fun.
So excited for the launch and, and really happy to be on the show.
Helen Todd: Thank you. Well, and for our viewers and listeners, who aren’t familiar with you, can you kind of do your elevator pitch of your background and how you kind of came into the space where you’re at today?
Jon Tota: Sure. So I, I really started, well I started as a screenwriter, so I, I love video and film and creating online content.
When I got started with online video was, you know, kind of going all the way back to like early 2000s. We had done training primarily at that time in the financial services industry. And our idea was like, we couldn’t really afford to burn CD ROMs or DVDs and ship them out to people. So we started putting our training videos online.
I loved it cuz I wanted to write content. Ideally I wanted to write feature length films, but I ended up scripting training videos for, for the financial services space. But that introduced us to distributing online video. We wanted to do it in video back then, that was before YouTube launched.
So having a video training delivery system wasn’t really commonplace. It was a little hard, a little bumpy, convincing people to do that early on. But it got us into that. And when then we realized that we loved building the software platform that delivered the content probably more than creating training videos.
Also, I think we only had about 20 good training ideas in our head. So, so we switched and really became a platform for companies to train their employees with online video. That’s evolved over time. And then one of the biggest challenges for us was always, you know, people loved the idea as the platform scaled and the technology caught up and, you know, later in, you know, around like 2008, 2009, really everybody was comfortable with online video and, and our platform really did well in that space.
But the challenge was always that people didn’t really have that much content and then you had to get them into the studio, to produce online, or to produce the video and the training materials just to put them online. And so then we got heavily involved in the production of content, just kind of as a necessary evil.
And, and I think, you know, over time that’s what really led us to synthetic media and all these great AI tools because, you know, at the end of the road here now we’re, you know, 2023. And if we had these tools to help people create content quickly and, and very cost effectively, you could, you could have got so many more people creating content and putting it online.
So that kind of led us to, you know, the, our production company, which is Syntax + Motion. Was kind of a spinoff from the platform company we sold, Knowledgelink that was the platform back in, in 2020. Syntax + Motion was kind of like the internal production company of that. And then that led us to really synthetic media first and then, and then AI tools, during the pandemic cuz people couldn’t come into the studio or produce.
And, and then we spun that off into, into Render, which is really the business today that’s focused entirely on helping people create avatars of themselves and use AI to create online video. So that’s where we are today.
Helen Todd: Amazing. And for those who don’t know what synthetic media or digital doubles and what these avatars are from the stock to the custom, can you kind of explain to everyone listening what, what this is? Cuz it might be new to them.
Jon Tota: Yeah, so it is, and, and you and I talked about this a little bit for me. I’ve been working with synthetic media for probably the last three years, so I do sometimes you, you kind of roll over that quickly and then you think like, okay, wait. A lot of people haven’t seen it at all.
The first time they see an avatar like this or you know, there’s so many different names like, you know, digital humans or digital twins or since, you know, and so, we’ve probably in our marketing, have tried all the different terms, but in reality what it is, is essentially a digitally generated version of you.
At this stage, the very best versions are, begin with a certain amount of video footage that we capture of the individual. We also clone their voice and then using our application from that point forward, they can type in scripts and then their avatar delivers that message in, in your voice, in a photorealistic version of you.
And, and so, yeah, so that’s essentially what it is. A digital replica of yourself that looks very lifelike. Probably more so than most people know or are comfortable with, right in the beginning, I guess.
Helen Todd: Well, I know when we spoke, it was right before South by Southwest, and I was so excited that I think everyone I met at South by Southwest heard that we’re gonna create a digital double of myself.
And as soon as I was like, this means my digital double will be my TikTok strategy where I can feed content and have, I think I’m naming it like Helen 2.0 Odd. So it’s Helen 2 odd. Yes. Or something like that.
Jon Tota: Oh, right, right.
Helen Todd: As soon as I was like, I never am gonna have to create a TikTok video, everyone was like, I want one. Where do I sign up?
Jon Tota: Yeah. And, and I think we’ve seen a lot of very interesting avatar names at this point. So, you know, and that’s, that’s one of the things, and, and I, I think it’s one of the aspects of it that people are concerned about. But also what’s what’s kind of interesting in, in our space that like we’re kind of helping people figure out how does this avatar fit into your communication strategy?
And where do you use it? When is it appropriate to use an avatar and when should it be you in, in real life? And the other side is like, how do you present it? We, early on, I think a lot of people wanted to create a replica of themselves to save time and not necessarily tell their audience that it was an avatar.
And I think that sets up kind of an inauthentic relationship with the technology and, and your audience. And then we kind of evolved to helping people introduce their avatar as like, you’re, like you’re saying Helen 2.0 and like a member of your team, someone who can communicate with you in video. Cuz we all love to communicate in video and be able to to market ourselves that way.
But to be able to do it without you having to go on camera all the time. And that kind of depends on you introducing your digital twin or your, your replica in this case as like a, a member of your team or a tool for you to use. And then people are much less critical of its performance and that it’s not as lifelike and animated as you are in real life.
But it’s also kind of the more ethical way to, to put your avatar out there. And, and so that’s, yeah. So it’s an interesting part of, of naming your avatar and how you introduce it to the world and all of that.
Helen Todd: And I think it’s so fascinating that you brought up, you know, the transparency, how to introduce, if you introduce, avatars.
And I know that there are some celebrity examples out there that people might have heard of where it kind of, I guess punctuates this point, but how, what are some of the ethical issues that you’re thinking about related to, to this technology?
Jon Tota: So, you know, and we’re in a different space, I guess. So this is where, when I look at at least the synthetic media space that we’re in, there’s kind of two worlds.
There’s a stock avatar space, which to us, we feel is pretty routine at this point. Cause that’s where the technology was really used most. And it’s where it is today, most widely used for large corporations. And, and it’s a brilliant tool for that perspective, right?
You’re a, a large corporation with 10,000 employees all around the world, and you wanna create multiple versions of a video very easily in different languages with culturally appropriate talent at low cost, right? And have an instructional designer be able to swap those, those, you know, those actors out as they need to or update the content very easily. So great situation for that. And that is the stock avatar space.
And, and I think that there’s a little bit more of a challenge in that world with fake news and what you’re, I think kind of mentioning like deep fakes, deep fake Tom Cruise, right? Like the most popular one that people talk about, very different in how that’s done because there are stock avatars that are designed to be used this way.
The actors are compensated for their avatars being used in that way. And then there’s that deep fake space that is really what we all have heard the most about from a consumer perspective at least to this point. And that’s where they essentially have like an impersonator act like Tom Cruise in this case, and then they use Face Swap, to swap it in and make it really look like it’s truly him.
Now that is, you don’t have their voice and you don’t have their actual footage. Although the tools have gotten so advanced now that you don’t really need that actor’s permission, there’s so much of their content out there that you will be able to clone them very easily.
But what we work on is, and we like to call it digital likeness to kind of separate us from the stock avatar space of helping someone digitally clone themselves. And now that’s with their permission, they have to come into the studio, then in, in person to capture the footage. They have to sit with us in a studio location, ideally to record the audio that we use to clone their voice. And so now you really have their permission in that process, we verify them and make sure it’s truly them, obviously, that are creating it.
And then in our application we can validate that user based on the content that we have in their profile, that they truly are the person that they’ve uploaded a photo of or something like, you know, in that nature.
So yeah, so for us it’s digital likeness because it’s really helping people replicate themselves and then use it in place of yourself. But not like a deep fake and not fake news, which could be done with stock avatars. But really for this idea that you can create more content and you can be more creative in more different ways if you had access to a tool like that. And that’s, that’s kind of our corner of the world right now, at least.
Helen Todd: Yeah, it is so cool and fascinating and it feels surreal, like we’re living inside of a sci-fi movie with bringing our digital doubles to life and whatnot. When I was in Texas, I actually went early to, to visit some family and my 90 year old great-aunt and 92 year old grandmother, I was like so excited to talk about, you know, our conversation and was trying to explain these synthetic media and avatars. And they looked at me like I was an alien and I was like, oh, you know, kind of like picking out avatars from games and still nothing.
So it’s, I think the world that you’re living in is really at the, at the forefront of what, what can be done on, on this end. And I’m also just really excited to have a synth made of me.
So you mentioned a few applications of like large companies. I know it’s someone at South by Southwest who works for the United Nations. They do, I guess hosts like 700 webinars a year. Instead of their team doing it, they’re now using, I don’t know if they’re custom or the stock synths, but are there any other like interesting applications that you’ve come across or working on with clients or your own studio on?
Jon Tota: Yeah, so for us, we tend to work on the more at the individual level as opposed to a large enterprise. And maybe those individuals are sponsored by a large enterprise, but our use cases are more tied to someone who wants to expand their personal brand, and use their, their digital likeness to do that.
And so yeah, so our use cases are really interesting actually because you’re seeing people use it particularly like small business owners. People who are in, in our case, a lot of them are thought leaders, influencers, professional speakers, authors.
We’re starting to work in some business situations where they sell on their personal brand, realtors, car dealerships, financial advisors, and, and so where an enterprise might use those stock avatars and their use cases are gonna be, you know, pretty amazing that they can train so many people and be efficient and run webinars like that in different ways.
For us, it’s, you know, social media you see people using, as you were talking about earlier, being able to create volume content, without having to go on camera all the time to use on TikTok or things like that.
I, I think for us, it’s kind of cool to see how people use them in their sales process, where you might, even if you are, say a professional speaker and you’re trying to book a, an event, you wanna send something out, you have a presentation and now you can, customize that presentation, drop little videos of yourself in there, kind of presenting it.
And so you see some cool things like that, some things that people can do at scale because you couldn’t record that video a hundred times over for each one of the clients you’re talking to. But it’s very easy to rerun that script a hundred times over and just change the person’s name each time.
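The “same script, a hundred names” workflow Jon describes is essentially a mail merge for avatar video. Here is a minimal sketch of that idea in Python; the template text, client list, and the stubbed-out render step are all hypothetical, not Render’s actual API:

```python
# Minimal sketch of the "rerun one script per client" idea: one template,
# one personalized script per client. The template and client data are
# hypothetical; submitting each script to an avatar platform for video
# generation is left as a stub.

from string import Template

script_template = Template(
    "Hi $name, thanks for considering me for your $event. "
    "Here's a quick preview of the keynote I'd bring to your team."
)

clients = [
    {"name": "Dana", "event": "sales kickoff"},
    {"name": "Priya", "event": "leadership retreat"},
]

personalized_scripts = [script_template.substitute(c) for c in clients]

for s in personalized_scripts:
    # In a real workflow, each script would be submitted to the avatar
    # platform, producing one personalized video per client.
    print(s)
```

Recording that video a hundred times on camera would be impractical; generating a hundred variants of a text script and rerunning them through an avatar is trivial, which is the scale advantage Jon is pointing at.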
And so, so those are some of the interesting things we’re seeing. I think we’re amazed all the time because now we’ve produced, well, I’ll roll back a little bit. So, we started Render officially, you know, it was still under the Syntax + Motion umbrella last year, but really last year about this time.
We had done, you know, maybe a handful of avatars where it had produced these, these, digital replicas of people up until that point. And we were having a hard time getting people into the studio because that’s a big obstacle for us because in order to get one of these done, as you know, so we’re talking about doing yours, you have to come to the studio and we need about 90 minutes in the studio.
We do hair and makeup, we capture the video training footage, and then we capture the voice clone. That’s hard to get people to Burlington, Vermont, where we’re located. So, last year about this time, we decided that we would launch this studio tour. The first one was at an event for the National Speakers Association, and we tried it out there, popped up a kind of a mobile version of our studio here in our headquarters, popped it up at the event and we did about 25 avatars, 24 avatars in three days of a conference.
And like the light bulb went off and we’re like, wow, this is really cool. Like, if we bring the studio to people, we can get that obstacle outta the way and get them using an avatar more quickly. So from there, and over this past, I guess now it’s about 10 months, we’ve done a studio event like that in a different city every month. And we’ve produced in that last, you know, 10 months, nine or 10 months, we’ve produced 110 avatars.
So that’s really cool because nobody really does that volume of custom avatar generation and producing them, taking people into the studio, teaching them the right way to perform on camera so that the training footage really works well.
And then generating it with our partner who runs the AI model that creates them. So it’s been really cool to see because you are learning the whole time, like the right way to capture footage so that the avatar comes out lifelike enough, but not so lifelike that it blows up.
You know, that’s kind of one of the interesting things when you deal with AI. These avatars are generated using an AI model. And we don’t do that. Our technology partner that creates the avatars is Synthesia, and you see them a lot. They’re really big in the stock avatar space.
But we reached out to ’em and said, we’re really not interested in stock avatars.We wanna make custom ones. We were one of the first customers of theirs, a few years ago creating, who started creating stock avatars, I mean, custom avatars. And so, for us, we’ve kind of really scaled with them and they’ve been a great partner from this perspective.
And just learning, like there’s things when we upload that, that training footage with them into the model that if people do certain things with their hands or with their face or there’s jewelry in a certain location like earrings, hanging earrings, it can blow up the AI generation of the avatar.
So it’s really cool. There’s a lot of really funky outtakes from the generation of these avatars. But yeah, so it’s fun. It’s a really great learning process. And then it’s really cool to put it in the hands of people and see what they do with their avatar once they’ve, once they’ve got it.
And, yeah. So the process is fun. It’s really amazing for people to see it the first time. We, I think, we don’t think it’s as interesting as it is because we’ve done it so many times in this last year. But it’s a really cool process for people to go through on their own.
Helen Todd: Oh, it, it’s so cool, Jon, I tell everyone about this, that I meet.
And I guess, for the listeners too, I’ll be sure to include links if you’re interested in your digital double or replica to see where the tour is going. I know Jon and I have talked about bringing one of these capture events to Cincinnati later this year, which, you know, sign up for the newsletter so, you know, you’ll be the first to know about all of that.
So one thing that you had said about the digital doubles, which I thought was kind of interesting, was kind of like, what is it, what is the term Uncanny valley? It’s like, you know that they’re real, but then they do some things as like, what is that?
Or they just saw like pauses and ums. Can you kind of tell us about some of the, you, you touched on it already a little bit, but some of the things where it’s like, that gives them a way that they’re really not human when we see them?
Jon Tota: Yeah. Yes. Yeah. And this is like a public service announcement, so I, we can teach your audience how to spot an avatar in the wild.
Helen Todd: Well, reality’s gonna be so weird from like, what’s real and what’s not real. And if people have never like seen these before, you’re gonna see them on this show and you’re gonna start spotting them everywhere now especially now that you’ve heard of them.
Jon Tota: Yeah. And I, and I think now at this stage of the technology, so we started in 2020, kind of mid-2020, getting involved with the technology, really launched our first offerings around this in ‘21, launched our application in ‘22. And so that’s been out there for this last year. And now we’re really advancing on the application side quite a bit.
But we’ve seen it progress quite a bit. I mean, each quarter the technology gets better and better. And to your point, I think three years from now, maybe even faster than that, maybe a year from now at the rate things go, the things I’ll tell you now, you probably won’t be able to spot any longer.
Because voice, I would say... if you had asked me this question even three months ago, I would’ve told you that the voice is the way you spot an avatar in most cases, cause it’s very hard. I think it’s easier to create the replica of someone in a photorealistic way, when you work from that video footage, than it is to clone someone’s voice.
But there are so many voice AI tools out there; that was one of the first big areas of this synthetic media space. And they’ve gotten really good in these past six months or so, to the point that you can clone off of very little audio data now.
And it was a problem in the beginning, and this was the way you’d spot it: the AI in the early stage didn’t know the context of the sentence, so it couldn’t inflect up or down. We had a financial advisor using it, and it was kind of funny, cause the market was really not doing well and he had a market update go out, and it said, completely flat, boy, what a week it’s been. And you’re like, no, that’s gotta be either an upbeat “boy, what a week it’s been!” or a weary “boy, what a week it’s been.”
And so we worked so hard in the early stage at capturing inflection and emotion in their voice. And it would take us, you know, an additional 30 or 40 minutes of recording time to capture that. And then we had to manually train each person’s voice to be able to do that.
In the last few months, that’s all been built into an API service. Now you capture 10 minutes of training footage, and it can even put breaths in between sentences, and inflect up and down based on the context of the sentence. So the AI capabilities, as you know from everything you’re seeing in the news and in the industry, have gotten so good that the voice AI, I think, is almost indistinguishable.
And it had been at very high price points up until now, but now it’s available at that level really inexpensively. So on voice, anything I would tell you today would not be true a few months from now, because voice is gonna be almost indistinguishable soon.
As far as avatars go... we were just hired by an agency to produce the worst avatars possible, cuz their project is to kind of spoof avatars. And so they were referred to us because, I guess, we should be the best at making bad avatars now.
Helen Todd: That’s funny.
Jon Tota: Yeah. So I just went through this process of, like, okay, what are all the outtakes? Obviously, the body movement loops. So it depends on the type of avatar.
We deal primarily in Synthesia video avatars. And so in that case, you will see the avatar do the same things over and over again, because it does generate off of a loop of video content. So you’re looking for that.
I think one of the things that we try to eliminate is too much jewelry on the wrists, because it loops footage and your hands come back together. And sometimes, they call it, like, ghosting: if the hands don’t come back to the exact same location, they just disappear in the loop. So if you’re watching someone, it’s very faint, but you’ll see their hands disappear at moments.
And then the one thing that’s actually really funny, they call it gulping. Avatars tend to go like this when they’re waiting for words. It’s a really weird effect, and they’ve just done a new release on the model that should fix that more. But you’ll notice, if you’re watching avatar footage, the avatar’s waiting in kind of this dead zone for the next set of words to start.
Helen Todd: Mm-hmm.
Jon Tota: And so they do some weird things in between sentences and things like that, that you can watch for.
And on the uncanny valley: the other technology we work with quite a bit is photo avatars. And those are amazing because there’s very little barrier to entry. You can take a photo of yourself on your phone, upload it into our application, and then begin animating that to your voice.
And those are still a little bit in the uncanny valley, because the body doesn’t move at all, just the head. But it’s completely AI generated, and it’s really amazing when you think about it, cuz it begins blinking its eyes and moving its head, and it’s doing that off of a photo.
So those are probably still very much... if you saw one, you’d go, that’s really creepy and not something I’d ever want to use. But I think you give it a year, and you’ll be able to take a photo of yourself and animate it as well as we do with full video footage today.
Helen Todd: So you were talking about the photos coming to life, and it really made me think of MyHeritage. If anyone has seen that campaign, it’s a website that kind of animates old photos, and there were a lot of really interesting reactions to it. That’s kind of the same type of technology you’re talking about, right, Jon?
Jon Tota: Yeah, exactly. That’s the exact same type of technology. And, you know, there’s a lot of interesting use cases. The one you’re talking about is kind of creepy in a way: taking an old photo of an ancestor and animating them to a voice message. But there are some interesting use cases, and we’re working with one client now that is a museum. That’s a really interesting one, where you could really have the photography or the portraits in a museum, in an online exhibit, actually come to life and speak to people. And it’s more of a novelty, right? Like, I think that’s cool. It’s maybe a way to get kids more engaged in a museum presentation, making something seem a little bit more lifelike when you’re watching it online. But it’s still at the point where you couldn’t effectively convince people that it’s real.
So you should kind of lean into the uncanny valley aspect of it.
But yeah, it’s cool. And I think it’s a really good example of how far AI can go. Because what we create from, essentially, about 15 minutes of video footage used for training a video avatar, it’s amazing how realistic that comes out. But you’re putting a lot of data in to train it.
When you take just a simple photo, and if you saw the MyHeritage thing, or, you know, any of the ones that we do, that’s arguably better AI technology, because it doesn’t know what my face looks like when I turn it left to right, or when my eyes blink, or when my mouth opens.
So it’s a little more experimental, but you can see that if it can make your eyes blink and your head move convincingly, well, why couldn’t your body start moving next? And then it starts becoming a full lifelike representation of you.
So I think when you see that... we have that integrated into our application with D-ID, which is one of our other partners, and they’re fantastic. I think people don’t use it too much yet; I think they’re still trying to figure out where it fits in their model.
But I look at it, and I think about where they’re going with that technology, and how it can open the door to so many more people, because you can just take a photo with your phone and animate it.
If you talk about the future of at least our corner of this AI space, I think it’s seeing how far people can take a still photo and turn it into a photorealistic animated character, which is gonna be pretty amazing to see. Maybe a year from now, that’s gonna advance quite a bit.
Helen Todd: Oh yeah. Between, like, the live photos and the masks that all these platforms offer, you know, Snap masks and Instagram and TikTok, there’s already kind of this AR blending of reality. And then, paired with this AI animating the photos themselves, I think it’s gonna open up all types of things.
And actually, I mean, the metaverse, you know, is kind of dead in the water. Not completely. But one of the cool things, in one of the videos or little ad commercials that Meta put out, was more of a museum aspect, where not only were the paintings coming to life, but in a metaverse application you were actually inside the painting, interacting with a painting in a museum that comes alive.
So I think that’s still a possibility down the line too. That could be really interesting to explore.
Jon Tota: Yeah, you know, and I think, early on when we started doing this, and this probably goes back 18 months or so, part of our roadmap was, like, okay, well then when do you take these avatars to the metaverse?
And that was all anyone talked about. And if you didn’t have a metaverse play in your model, in the avatar world, it was like, what were you doing?
So we had it kind of in the back of our minds, this, I guess, avatar version of yourself. And I try to stay away from the term avatar, cuz people automatically think of a video game avatar, or what you saw with, like, Mark Zuckerberg’s avatar in those commercials and promotions.
They’re cartoon characters, they’re not perfect replicas of you, and everybody kind of looks the same. But I love the idea that in my metaverse, whether it’s with Meta or Decentraland or Sandbox or any one of them, why wouldn’t you have one photorealistic representation of yourself that can go from one platform to the next?
Why should I have an avatar that is, you know, kind of predetermined by the metaverse world that you’re in? It should float with you, almost like a wallet, from one to the next. And we really loved that concept; it was where we wanted to head. And then the metaverse kind of deflated a little bit. And we’re also a business, and I had a friend of mine who told me, well, why would you waste your time on avatars like this that exist in, like, a 2D world? We don’t need them. You’re here. We can do this through Zoom.
And a lot of people like that were saying, well, just jump right forward to the metaverse. But, you know, the metaverse play, I think it’s super cool, but it’s five or 10 years down the road. And if you’re building a business around this, for us it was like, okay, let’s give people ways that they can use avatars and replicas of themselves in the 2D world, like in their emails and on their website or in blogs or on social media, and solve problems at that basic level. And then hopefully, somewhere down the road, you can do more with it. Cuz I think Decentraland had a cool museum exhibit like that, and I love that the picture could start talking to you on the wall, and you’re engaging with it like it’s the real world.
I just think we’re pretty far from that still, right? It’s a cool future. But for us, we’re kind of living in this present day, and that’s one of the challenges: trying to create solutions that add value, that solve problems using somewhat future-state technology, and then helping people understand why they need that and why it could add value, before we get to, you know, fully floating around in a metaverse, right?
Helen Todd: Yeah. Well, I love it, because you’re kind of setting yourself and your business up to be very relevant for whenever the metaverse is ready for mass adoption, and creating so much value for right now, as we’re all kind of learning and experimenting with these tools.
And I think this is a great segue: if the metaverse isn’t on your roadmap anymore, as it was maybe last year, and I feel like everyone might be switching that up too, let’s talk about what is on your roadmap. Because the technology is moving so fast, at breakneck speed. How are you thinking about looking down the pipeline on your roadmap?
Jon Tota: So, yeah. As we’re talking now, in April 2023, I think, you know, ChatGPT took over this world, really kind of caught the general consciousness around AI. And we had been playing around with GPT-3 for about a year before that. It had been available to developers, and we’d been looking at the integration of it and what that could add to an application like ours.
And, you know, it was kind of along those same lines. I was always drawn to the metaverse for avatars because of this idea that it’s a virtual world, and my avatar is more appropriate in that virtual world than I am in real life. And that was kind of the whole argument of, why do you need an avatar in this world?
It’s more appropriate in that virtual setting. And so that was always in the back of our minds. And then it’s like, okay, well, what’s the purpose of avatars, right? If they’re in this world, how can they be better? How can they save more time for people?
And then, we’d always looked at GPT, though I don’t think as clearly as we saw it after ChatGPT was launched. But we were amazed by it, because avatars require one thing to perform, which is a script. And we tend to work with people who wing it off the cuff; they come into the studio and they’re like, I’m a professional speaker, I do this all the time, I don’t need a script.
And then we also work with people who are academics, and they have tremendous amounts of knowledge; what they deliver in that video is so valuable, yet they hate being on camera. They love writing, but they don’t necessarily wanna perform it. And we kind of live between those two worlds.
So we’ve always worked off of teleprompters in the studio. We loved focusing with people on the script, getting it perfect, and then, when you get in the studio, you’re really just performing to that script. And I started as a screenwriter, so admittedly, I believe more in scripting than other people do.
So that always drew us to this idea that our process doesn’t start with the avatar, it starts with the script. Because an avatar doesn’t have any purpose in our world without a script. We do have a voice recorder in our system, because we work with professional speakers who don’t like to script.
And we’re moving to a world where people don’t want to use their keyboard for much of anything. So you can hit record, record your own voice, and your avatar will perform to that. But we looked at the text-to-speech area of AI as being the most powerful, the most time-saving. And we always looked at GPT like, wow, where does this fit into the model?
And, you know, I have two partners at Render: my co-founder, Moki Goyal, who’s our Chief Product Officer, and our Chief Experience Officer, our other partner, Jill Schiefelbein. And Jill really focuses with people on how you use this technology, how you get the most value out of it.
And Moki’s, you know, the one developing all the technology. He was our head of product development at my previous software company, so we’ve worked together for a long time. And he really does much more of all of this than I do. He’d been in GPT for so long.
But I think for us, like everybody else, once ChatGPT launched, all of a sudden you saw it and you’re like, oh, this is the way you use it.
And, you know, for your audience, people who don’t know all of this stuff too well: I always refer to GPT, cuz the large language model that powers it all, provided, developed, and maintained by OpenAI, is called GPT, GPT-3 and now GPT-4. ChatGPT, which everybody always just refers to, that’s an application they built, one out of probably 20 different applications that they could have built, and that they probably are building and releasing soon.
But it was this really brilliant way to bring the value of AI and large language models to a mass of users through a chatbot. A chatbot relies on the spoken word or the written word; it allows you to interact and engage with it, which is something so unusual and different from what you can do on the internet today.
And so it was this perfect fit. I’m sure they had lots of opportunities and ideas of what would be the first way they would show this to the public, but a chatbot was really a brilliant way to do it.
And so we saw it as soon as it rolled out, and we’d already been pretty far along in our integration work. And then we looked at it, and light bulbs went off for people like us, for all application developers. Because we’re looking at that and saying, oh my, in traditional application development, that would take us nine to 12 months with a really robust development team to add that type of functionality. Even with algorithms and knowledge bases, and not artificial intelligence, it would take you that long just to add that type of interactive capability to your application.
So within a few months, we were able to integrate GPT, GPT-3 now, and soon it’ll be GPT-4, right into our scripting process. And for us, it’s like the coolest thing.
And being a writer myself, it’s this idea of having essentially a virtual writing partner. And, as everybody talks about, there’s GPT, Bard, and, you know, Amazon just announced their tool set. Oh, Amazon’s stuff is amazing, what’ll be coming to AWS.
Helen Todd: Oh, I haven’t played with that yet…I’ll have to play with that after this.
Jon Tota: Yeah. It’s not even released yet. It’s like a foundational tool set through AWS that you’ll be able to build applications with. And so it’s just amazing when you look at all of them.
But this idea that, for us, scripting is so important to the process of using an avatar, and now you’re not scripting alone; you’re scripting with a virtual writing partner that can help you through it. And with all of them, it comes back to prompting.
As you know from playing around in the space, your ability to prompt, and to teach people how to prompt in better ways... the results you get back from it are just limitless. So from a writing perspective, it’s, like, one of the coolest things.
Helen Todd: You know, that blank page is a real struggle that a lot of creatives deal with. I think I shared with you that my first demo was to take this mini-series and script idea that I have. My friend showed me the demo, and that aha moment happened for me: we set up the scene with a very basic prompt, and within seconds it had the characters, dialogue, and what they did in that scene.
And I was like, holy moly, this makes, you know, writing this script for this mini-series much more doable and within reach, which it still might happen, but I’m launching this podcast instead.
Jon Tota: Yeah. It’s an interesting point, because I think it’s good for people to understand a large language model that has been trained off of content on the internet, right?
There are certain things it can do better than other things. And when you talk about writing for film and television, that’s always been a passion of mine, so I’m, like, really interested in how well it can write in that format. Cause I remember when I started in screenwriting, I was working with a producer, and he said, I’ll save you a lot of money.
You don’t need to go to film school to be a screenwriter. He said, you should go to film school if you wanna be a director, but if you wanna be a writer, either you can write a good story or you can’t. You’re not gonna learn that in film school. What you’ll learn is the structure, and how to craft a script so that it can be converted into, you know, film or television.
And he said, and I’m gonna give you three scripts, and if you can’t teach yourself that, then you should quit because you’re never gonna learn. Now I’m not a screenwriter today, so that’s not why I quit. But, but it’s
Helen Todd: I’m not sure about this advice as a teacher, like, instilling this all-or-nothing mindset. I don’t know about that.
Jon Tota: Tough love. It was tough love. But the idea is that on the internet there’s a lot of content around scripts and film and video. And so what you noticed, and I saw that early on too, is that models like GPT are innately really good at writing for film and television, almost scarily so.
Especially when you do that for a living, in some cases. But yeah, it’s interesting. It’s kind of one of its natural abilities, because there’s so much of that content out on the internet.
Helen Todd: Yeah. That is fascinating. Well, you mentioned the importance of prompts, and I know every marketer who’s embraced ChatGPT and these AI tools has already put together prompt lists. I think every newsletter is offering free prompts and whatnot.
Since you’ve been in the space a little bit longer than most of these, new AI experts popping up left and right, how do you think about prompts for the work that you’re doing when you bring people in and help on the scripting side?
Is there a specific approach that you have or some insider tricks of the trade that you can share with us on how you think about prompts?
Jon Tota: Yeah. I guess my perspective on prompting and all of these AI services comes from the perspective of someone who builds software products for a living.
Because I think what you will see quickly, like when we talk about the AWS services that are coming, those are designed specifically for developers like us, to build better applications more quickly. And I think what you’ll see, probably when you look back at this in a couple of years, is that the biggest impact, from an application development perspective, is that it has leveled the playing field.
With a small development team, we can build a functional application that can do things that would’ve taken us years to develop with a team five times the size, because of prompting.
So when I think of prompts, I think of it from an application developer perspective. Okay, take this for an example; this is a real example in our application. You take a transcript from a three-minute video that you’ve done, and you wanna put a short highlight clip of that video out on social media, but you only want it to be 30 seconds.
Even a good writer will have a very hard time doing that. You can take that script and drop it into our scripting engine, and we have a preset prompt, make this 30 seconds, and that will use GPT’s power to rewrite that script to exactly 30 seconds.
I love to use that function as an example, because I know as a writer that it would take me all day, and I would fight with myself on what lines I would actually agree to cut out of that, right? Cuz we love all our words so much.
But that is a perfect use of the technology, and it kind of illustrates how prompting is so valuable to an application developer. Maybe not in all applications, but our Render application is an online video creator, so it’s a perfect application for what GPT can do.
We start with a script that is too long, and that script is written by me, with a certain perspective on culture, society, and life, and it’s very serious in nature. I can take that script, maybe it was written for a blog post or something like that, drop it into the scripting engine, and through pre-programmed prompts I can make it 30 seconds long, make it appeal to a millennial, and make it funnier.
And I, as a person, would probably never be able to get my script there, and GPT can do it in a matter of seconds. So that just shows the power of prompting. I don’t even know that there’s a way we could ever develop technology to do that without AI, and even in traditional application development, it would take so long to build that functionality.
That is essentially just interface development for programmers now, because you’re accessing this massive amount of power through prompting.
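For listeners on the developer side, the pattern Jon is describing, curated prompt templates baked into the application so the user never has to write a prompt, can be sketched roughly like this. The preset names, instruction wording, and `call_model` stub are invented for illustration; this is not Render’s actual code.

```python
# Sketch of the "preset prompt" pattern: the application supplies the
# prompt engineering; the user only picks a button and provides a script.
# All names and instruction wording here are hypothetical.

PRESETS = {
    "make_30_seconds": (
        "Rewrite the following video script so it runs about 30 seconds "
        "when read aloud (roughly 75 words). Keep the core message."
    ),
    "millennial_funny": (
        "Rewrite the following video script so it appeals to a millennial "
        "audience and has a lighter, funnier tone."
    ),
}

def build_prompt(preset, script):
    """Combine a curated instruction with the user's script text."""
    return f"{PRESETS[preset]}\n\n---\n{script}"

def rewrite_script(preset, script, call_model):
    """call_model is whatever LLM client the app wraps (for example a GPT
    chat-completion call); injected here so the pattern stays testable."""
    return call_model(build_prompt(preset, script))

# With a dummy model that just echoes its input, you can see the shape
# of the text the real model would receive:
prompt = build_prompt("make_30_seconds", "Today we cover three market trends.")
```

The real version would swap the dummy model for an actual GPT API call; the point is that the prompt text lives in the application, not in the user’s head.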
And so, to kind of sum it up, I think the challenge for all of us is that the large language models, which take billions of dollars to train and maintain, are not gonna be built by companies like us.
It’s gonna be Amazon, and it’s gonna be Microsoft and Google and Meta. But it’s so much content that if you don’t build really effective prompting into your application, it’s just too much. The first iteration of putting GPT into our application was just, hey, you can script with GPT, click the spot; and then you’re still looking at a blank screen, because you don’t even know what to prompt.
And so I think what falls back on developers like us is this: if you can harness that capability, understand the right prompts for your audience, your users, and don’t make them think about the prompts, but program that into your application. Do the thinking for them; come up with the prompts.
Cause that’s the only thing we have to do; the AI model does all the heavy lifting. And now you’ve moved them more quickly through a process, because they don’t even have to consider the prompts. The prompts that are relevant to our users are, like, one tiny little percent of all the things you could prompt GPT to do.
So it’s kind of our responsibility as application developers to figure out and understand what your users need. And now, instead of thinking of features, I’m thinking of what different language models and AI services we want to integrate, and how we wanna guide our users to prompt in the right way.
And that’s kind of the way I look at that today.
Helen Todd: Yeah, that’s fascinating. It’s like, instead of everyone having to be prompt engineers, you’re like the prompt engineer concierge: through your software, you take care of this part so people can concentrate more on the ideas.
Yeah. And when you were speaking... was it Thomas Jefferson? Someone has a famous quote where he wrote a letter and said, if I had more time, this would have been shorter.
Jon Tota: Yeah.
Helen Todd: Because sometimes, the brevity is the hard part for, for writers, in that regard.
One thing that you’ve also said a few different times throughout our conversation is just how much more content everyone’s gonna be creating, from more scripts being created, to the digital replicas being able to create way more content from one person.
And it seems like there’s just gonna be a flood. I mean, there’s already so much content that exists, and, you know, as a social media marketer, it’s like we’re always feeding the content beast. Have you thought at all about what it means for discoverability, and how people are gonna find this content, if one person or one company can now do 10x or 100x the amount of content creation and generation?
Is that all on the shoulders of the algorithms, to help surface the top content? Or is it the need to build direct lines of communication with your community, to feed them the content? I was thinking about that after our conversation: what does this mean when everyone’s creating 10 times more content than we’re already creating?
Jon Tota: So I think an interesting point on that is that every problem AI creates, it will also solve. There’s gonna be an AI service on one side helping you create more content than anyone could ever use, and then an AI service on the other side filtering all that content out, right? So we’ll probably end up, net net, in the same place we are today.
Helen Todd: That’s funny. Well, and, you know, as I’m sure you find when you tell people about the space that you’re in, a lot of people have a lot of fear around these tools. What are some of the reactions that you’ve gotten, or some concerns that you’ve either heard or actually have, in addition to what we already talked about with the validation of whose voice and IP it is, and whether it’s being deepfaked or actually used with the artist’s consent?
Jon Tota: So, yeah, this kind of brings us into a whole nother piece of the conversation. Because now, you know, we’re talking about what we do, which is very unique, I think, because we pair your digital likeness, which is a photorealistic version of you, with a voice clone, which sounds like you.
And now we’re helping you script, and ultimately edit, that video using AI services. So if you wanna talk about, like, potentially the most inauthentic communication ever, right? That’s our baseline.
So now I’ve shown you the negative side of what we do, if you want objections. The challenge, then, is helping people understand, one, what we’ve already been dealing with: how do you use that avatar in a way that is authentic to you? But now, just within these last few months, we launched our GPT integration, and now we have a DALL-E integration.
So now you’re gonna be able to get automated b-roll footage right into your video, and you’re really depending on AI to connect the dots in a lot of ways for you. I still come back to the script, because I don’t believe, certainly not today, that AI can capture your voice perfectly.
And maybe not in the future either, but probably better in the future. Cause I think we’ll all have personal language models, right? Like, that’s the future. I listened to such an amazing interview with the Google CEO on Hard Fork, another, you know, New York Times podcast.
And it’s so interesting, cause you see how Google’s looking at it: they have so much data on you already that there’s a very clear path for every individual Google Workspace user to have their own language model. And now I don’t have to return emails to people; it knows the way I would normally send those emails. That’s amazingly efficient, but again, probably not the most authentic thing in the world. But it’s coming, and it’s coming really fast.
And so we look at it, at least where we’re talking about scripting, like, that’s the one thing that you own. And I don’t think that AI is gonna be able to effectively do the job for you. But as creatives, can it help me not stare at the blank page, like you talked about earlier? Start with some topics. And so again, prompting, right?
Our first step is, give me some topics to write about on this subject, and then it gets you started. Give me some takeaways. Give me the visuals, the screen direction for a video. That stuff is just natural, but it’s still your story in the middle. It’s kind of that same thing.
Like my, I guess, bad mentor at the time said, the craft is the part you can learn, and AI will, in a way, take over the craft side of writing, at least writing video for, you know, online purposes like we do. There’s a craft to it. There is a way to write an effective script to be delivered in an online video, and it is not the same way you would script for television or film.
And we’re not experts in that space. So my prompting capabilities, at least from an idea perspective, are not gonna be super valuable to a Hollywood screenwriter. But they’ll be really valuable to a professor at a university who wants to put an online course out there and have it sell itself.
Yeah, there's a certain way that you script your content. It's still your content, but if we do our job accessing the power of AI through prompting, we can help you make that script as effective as possible for this particular medium.
And it can add background music and logo animations and B-roll, all through AI, and you didn't need to waste any time on that. And then for us, the end game is distribution, 'cause that's really our background: our first product, Knowledgelink, was really about distributing online video. We did it for training purposes at the time, but this is no different.
We're distributing online video, but now for online video creators, and a large portion of that goes to YouTube or social media and things like that. So our first integration that we're super excited about from a distribution perspective, rolling out later this month, is distributing your synthetic media productions out to YouTube.
Because through that integration we can get performance data back from YouTube. And because everything is AI generated and using your avatar and not you, we can give you recommendations on how to make that video perform better on YouTube based on the data we collect. And we can also have the AI tools rewrite your script so it’s more effective.
And now you didn't need to do anything. You don't have to go back in the studio. You don't need to rewrite it. It just gives you a recommendation on how to make this perform better, and then you push a button and it's done. So to me, those efficiencies are what we're after, and hopefully that's what you lean on AI for, and you don't lean on it to essentially plagiarize your way to a first draft, because it could very easily help you do that too if you use it the wrong way.
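The feedback loop Jon outlines (collect performance data, then recommend script changes) can be sketched as a small rules-based function. The metric names and thresholds below are illustrative assumptions, not the actual fields of the YouTube Analytics API or Render's integration.

```python
def recommend_rewrite(analytics: dict, retention_floor: float = 0.4) -> list:
    """Turn YouTube-style performance data into script recommendations.
    Metric names and thresholds are assumptions for illustration only."""
    tips = []
    if analytics.get("avg_view_duration_ratio", 1.0) < retention_floor:
        tips.append("Tighten the opening 15 seconds; viewers drop off early.")
    if analytics.get("click_through_rate", 1.0) < 0.03:
        tips.append("Rewrite the title and thumbnail hook.")
    if not tips:
        tips.append("No changes needed; performance is above thresholds.")
    return tips

# Example: strong click-through but weak retention triggers one recommendation.
tips = recommend_rewrite({"avg_view_duration_ratio": 0.25, "click_through_rate": 0.05})
```

In a production version, these recommendations would be fed back to an LLM to rewrite the script, closing the "push a button and it's done" loop.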
Helen Todd: Yeah, you still need someone's mind and ideas at the core of it, and to just use these tools for efficiency and expanding what they're creating. Are you familiar with Esther Perel? She's this amazing relationship therapist, and she spoke at South by Southwest.
And one of the things that she mentioned, which was really interesting, highlights an odd application use. She's got such an amazing spirit, and I love her and all of her content, but someone trained an AI by scraping her content online and created an Esther Perel chatbot without her permission.
So he could kind of engage with Esther Perel as a therapist through this second self of her, and, you know, he found it very therapeutic. Now, it wasn't 100% accurate, it wasn't 100% her, and it wasn't done with her permission, but she found it really fascinating that someone would do that.
And I can see that application. I love the guys at Hard Fork, I'm a big fan of their podcast, and when you hear these things, it's like, well, what does Casey Newton think about this? These use cases are kind of a different form for getting people's thoughts out through these tools.
I don’t know. Does that make sense?
Jon Tota: Yeah. And I think you highlight one of the real dangers of what you can do with AI, because we've all leaned into content marketing and content distribution on all these free services hosted on the internet.
And if you had to paint a really scary sci-fi picture, or write a scary sci-fi movie, you'd say: let's create a technology that gets everybody to put everything about their thoughts, their ideas, their personal feelings and stories out there for everyone to see, and then create a robot that can regurgitate that and replicate you without you being there. Right?
Like, so social media and the internet got all of this content out of our heads and onto public sources because it was so easy to do it on YouTube, and it’s so easy to do blog posts on your website and it’s effective and it worked.
But now, Esther Perel is a perfect example of what that allows: a large language model can crawl the internet, particularly the work of thought leaders and educators who put a ton of free content out there. It gets indexed just like everything else on the internet, but in this case, it learns, and it can replicate a model of you.
And so, yes, it's really amazing technology and you can think of all the great things it could do, but I think intellectual property really becomes compromised in almost every way. But then again, you put it out on the internet for public use, so you can't blame AI for aggregating all the stuff that you put online and then teaching itself how to be you in place of you.
I mean, that's just the side effect of putting lots of great content online. So yeah, I think that's a perfect example of it. I also think it's a problem for really well-known, successful thought leaders who have lots of content online. But if people are making chatbots of you, you probably did pretty well in your career, so.
Helen Todd: Since you did bring up some of the IP issues, I just wanna be super clear that I support consent for any artists whose work trains the data sets, and that there are models out there where they're getting compensation and whatnot. This has come up a little bit more in the art community, because these tools officially create new art, but in the likeness of an artist's style.
So I just wanna be super clear about where we stand. I think the Esther Perel chatbot, which was not done with her consent or done ethically, still shows an interesting use case, so.
Jon Tota: Yeah, I think that's what you're gonna see more and more discussion around, because all of this technology has made it really easy for anyone with very little programming skill to do things like that.
And then it’s like, okay, it’s a cool novelty. It’s a parlor trick. Right. And did it hurt anybody? No, she didn’t mind. She probably took it as a really interesting experiment.
Again, it wasn’t doing anything that you couldn’t piece together by looking at the internet yourself. I think the real danger becomes when people are trying to pass that off as if it came from you.
And it's like taking derivative work of things that you put out on the internet, repackaging it, and saying, hey, this is, you know, Esther Perel 2.0. That's a problem. And I don't know that there's a technical fix to that.
I'm sure there's a technical fix that big companies like Google could build to let you protect content that you have out online, maybe keep it outside of large language models.
But yeah, I guess we'll see what happens with that. And there's this whole talk, right? I think this is why the open letter came out: slow down development on AI models, don't go past GPT-4, take a break and see where this goes. There's obviously a lot of discussion around it, and maybe it makes sense.
But I also think it's kind of funny when you think about it, 'cause you don't just stop. An AI model continues to train, and we continue to put more content out on the internet and it learns more. You can't stop the entire industry, and as soon as you tell one large company to stop, in secret, their competitors are gonna be pushing forward. Right?
Helen Todd: Oh yeah, and outside of the United States too, where it's even more loosey-goosey in some countries. And there are some companies like Adobe that I think have a really thoughtful, probably one of the most ethical approaches so far in terms of their content verification: it's somehow encrypted in the metadata, and even if the content is screenshotted, it can be verified based on image recognition against their database in the cloud.
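The idea Helen describes (provenance data that travels with a file, plus a cloud registry that can re-identify content) can be sketched with a toy fingerprint lookup. This is a deliberately simplified assumption: real systems such as C2PA / Adobe Content Credentials use cryptographically signed manifests and perceptual matching, not the exact-hash registry shown here.

```python
import hashlib

# fingerprint -> provenance record; stands in for a cloud-hosted registry
registry = {}

def fingerprint(image_bytes: bytes) -> str:
    """Exact content hash. A real system would use a perceptual hash so
    screenshots and re-encodes of the same image still match."""
    return hashlib.sha256(image_bytes).hexdigest()

def register(image_bytes: bytes, provenance: str) -> None:
    registry[fingerprint(image_bytes)] = provenance

def verify(image_bytes: bytes):
    """Return the provenance record if this exact content is known, else None."""
    return registry.get(fingerprint(image_bytes))

register(b"fake-image-data", "Created by Helen, 2023, edited in Photoshop")
```

The exact-hash shortcut is the key limitation: any pixel-level change breaks the match, which is why production systems pair signed metadata with image-recognition lookup.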
So, you know, I think it's something that you said earlier: for every problem that comes up with AI, there's some technological solution that others will come up with too.
Jon Tota: Yeah, yeah. You hear so much about the large language models and things like ChatGPT, but if you're in the production space and you've used Adobe recently, they keep introducing new AI tools and capabilities. For someone who does post-production for a living, just speeding that process up, making things faster, being able to take out background visuals and background noises, that kind of stuff always adds value.
And it's not about faking people and doing things like that; it's about just making the work itself faster. You know, I've run a production business for probably close to 20 years in one form or another.
And we don't get the large budgets that a film or a television commercial or a large agency gets, because we're developing internal training, communications, things of that nature, where you have to be very efficient. As a production company, your margins are already so slim that if you can now look at AI tools and say, hey, this just gives us an edge.
Like, we can just do things more quickly, with less subjectivity, and deliver a better product to our customers. I feel like what Adobe's done specifically with AI capabilities has given a shot to production companies on kind of the lower end of the industry. You give 'em a shot to actually build a business that can really make a profit in production.
And so yeah, we look at that, and every time we look at our application, we're like, okay, how would our own editors use this? How would this make their life easier or faster, or eliminate a video that they would normally have to manually produce?
And speaking of Adobe, I think they've done a really cool job of that. That's the other side of AI: it's not sexy and nobody talks about it. But if you look at products like that, where they put capabilities into an application that we would've never had the ability to use otherwise, it helps businesses.
I think it’s gonna have a huge impact on video production. Not just in negative ways, which a lot of people focus on, but in the fact that there’s a lot of small businesses out there that produce media for a living, and they might have a shot to actually really make money doing it with all of these AI capabilities. So I think that’s a really cool aspect of it too.
Helen Todd: Yeah. Well, I've said this before in another one of these conversations: smartphones and the iPhone turned everyone into photographers, and with these tools, I mean, we're all kind of video editors with the content that's needed for the internet these days. These tools are kind of democratizing creativity and giving everyone access to a playing field that they might not have otherwise had.
And it sounds like that’s true in your case.
Jon Tota: Yeah. And I believe, from a product development, a software developer perspective, they're gonna do the same thing for our industry. Like, I see a future where you will be able to sketch, I mean, you saw the presentation. It's the most amazing thing ever.
Sketch out a hand-drawn wireframe of a website, show it to GPT-4, and it can code that website. I mean, if you're a software engineer for a living, it's gotta make you look at that space three years from now and think, what should I be focusing on? Where do I go with my career?
Because that's one of the more amazing things I've ever seen. I am a business analyst at heart from an application perspective: I can't code an application, but I work very closely with Moki on the functionality and the design, wireframing it and working the flow out.
And then you lose control over your application. You give it to your engineers and you hope you get like 50% of what you dreamed about. But now there's a real future where you could democratize application development. And I think, you know, it's no different than what Amazon did with AWS.
With our application, Knowledgelink, we switched to AWS pretty early on, probably 2008, 2009. I think it came out maybe a couple years before that. Nobody wanted to go to a cloud-based application back then. But as a small business scaling its own software product without financing, or not a lot of financing, we really had to be efficient and figure out ways that we could do things.
We had a dedicated server for every one of our clients, so if we figured out a great application feature and put it into one application, we then had to recode 10 other ones on different servers. You could never make money; it was just an inefficient model.
And when we saw AWS and cloud computing, it was a really hard sell, probably just as hard as selling streaming video back in 2004, to get a large enterprise to move to that. But we looked at it and we thought: we can cut so much expense, we can develop faster, and we can bring features to everyone at the same time.
And if you look at it today, you would never build an application any other way. So with adding AI capabilities to AWS, which is what Amazon just announced, I look at that from an application perspective and think anybody with a good idea can build a software application now, deliver it online, and charge people to use it. It's like one of the coolest things ever.
You don't need to be an engineer. There are probably some new hurdles that'll come, and there'll be new paths for people. But I do think removing that barrier of entry to certain creative pursuits is one of the great benefits of AI.
Helen Todd: Yeah. I couldn't agree more. Well, I feel like this is a great segue to talk about predictions. We've kind of covered a few, but do you have any other predictions, or anything else that makes you really excited about AI and creativity in the space?
Jon Tota: You know, I think probably what's most exciting right now is similar to when we got into streaming video. We pitched our application to everyone, and there was always someone at the table banging the table with their DVDs, like, you guys are crazy, it's never gonna play. And it would rebuffer and jam up; you didn't have enough internet to stream video back in 2002, 2003.
And then YouTube showed up in the consumer space and took over the general consciousness about streaming video, what's acceptable and what you can do, in what felt like overnight. It was probably another couple years, but everything we were doing became acceptable, and people were interested in trying it, even though it might have still been a little cutting edge.
And I see the same with some of the things we've been doing with synthetic media for the last few years, which is AI-driven, but not fully enhanced by AI to the point that we are today. People were always looking at it a little skeptically; they weren't sure whether they could trust it or not. And then ChatGPT shows up, and all of a sudden, with everything AI, people are, maybe still a little skeptical, but they understand what it is, they believe in it, they're willing to give it a try.
And so I think what probably excites me the most about this is that if you're an early adopter or an innovator in some of these technologies, which I think we like to be, you bang your head against the wall, kind of yelling at people, trying to get their attention: like, this could be something you could actually use.
And then when one of the big players shows up and captures the consumer audience's attention, it opens up all these doors for you. Everybody who wouldn't take a meeting will consider it.
And so I think that's cool from the AI perspective. Synthetic media and the avatars, oh man, they just keep getting more and more lifelike. And for us, we've kind of changed the style.
Like, you write your script and then you cast it with avatars, either your own or, one of the things that I think is so cool, you can dynamically create AI-generated cast members to put into your script.
And I'm not suggesting that those AI-generated cast members can do what you would want in a television show or a film, but you see the capability: you could have extras in a film completely generated by AI.
And so with something like Stability right now, and certainly DALL-E, you see the dynamic generation of visuals. In the world that we live in, I can do only so much with the avatar you created, but when I can start dynamically generating and adding visuals around it, and music and things like that, it just makes it a better video, makes it more interesting. So I'm really excited.
And I think one of the things that maybe people don't know unless you're a developer in this space, one of the cool things about AI, is that there's almost always an open API, and I shouldn't say OpenAI like the company, right? Like, there is an accessible API to every one of these services.
And so it's kind of one of these industries that has gone in with this idea that we're gonna have a front-end interface, but we're also gonna open up our API, so that people like us can pull multiple different AI services into an application. And that aggregate model that you can create is so amazing.
And I look at the near future, 'cause I wouldn't predict beyond this year with how fast it's going, and I think that's what sets this industry up to be a really innovative space. It opens up the door for so many people to say, hey, I don't need to figure out what Stability has already figured out. I just need to decide how that integrates into my user experience and how I add value to the process. Now I can do something that would've taken me probably years to develop, and you can integrate it into your process like that.
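The aggregator pattern Jon describes (each AI vendor exposes an API, and your application composes several of them behind one user experience) can be sketched as simple function composition. The service names below are stubs standing in for real API clients, an illustrative assumption rather than any vendor's actual interface.

```python
from typing import Callable

# Each AI service is modeled as a function from input text to output text.
Service = Callable[[str], str]

def make_pipeline(*services: Service) -> Service:
    """Compose independent AI services into one pipeline:
    the output of each stage feeds the next."""
    def run(payload: str) -> str:
        for service in services:
            payload = service(payload)
        return payload
    return run

# Stubs standing in for real API clients: a script model, an avatar
# renderer, and a distribution/publishing service.
write_script = lambda topic: f"script({topic})"
render_avatar = lambda script: f"video({script})"
publish = lambda video: f"published({video})"

produce = make_pipeline(write_script, render_avatar, publish)
result = produce("digital doubles")  # -> "published(video(script(digital doubles)))"
```

The value an aggregator adds is exactly what Jon says: not reimplementing any single service, but deciding how the stages fit together into one user experience.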
So yeah. That’s exciting. It’s a cool space because it’s very interoperable right now and so I love that aspect of it.
And yeah, beyond that, I guess your guess is as good as mine.
Helen Todd: Well, Jon, thank you so much for your time today. I loved hearing all about this, and we'll be sure to link everything in the show notes. Be sure to sign up for the newsletter so you can learn when we'll do the content capture event in Cincinnati, and definitely follow Jon and his company so you can learn about his content capture tour if this is something that you're interested in.
And Jon, I'm so excited to see how your roadmap keeps changing, and the next time that we connect, to see where the technology is from there. So thank you so much for your time today.
Thank you for spending some time with us today. We’re just getting started and would love your support. Subscribe to Creativity Squared on your preferred podcast platform and leave a review. It really helps and I’d love to hear your feedback. What topics are you thinking about and want to dive into more?
I invite you to visit creativitysquared.com to let me know. And while you’re there, be sure to sign up for our free weekly newsletter so you can easily stay on top of all the latest news at the intersection of AI and creativity.
Because it's so important to support artists, 10% of all revenue Creativity Squared generates will go to ArtsWave, a nationally recognized nonprofit that supports over a hundred arts organizations. Become a premium newsletter subscriber, or leave a tip on the website, to support this project and ArtsWave. Premium newsletter subscribers will receive NFTs of episode cover art and more extras, to say thank you for helping bring my dream to life.
And a big, big thank you to everyone who's offered their time, energy, encouragement, and support so far. I really appreciate it from the bottom of my heart.
This show is produced and made possible by the team at Play Audio Agency. Until next week, keep creating!
Theme: Just a dream, dream, AI, AI, AI