“If AI is reaching the point where it will be virtually impossible to detect audio and video representations of people saying things they never said …, seeing will no longer be believing.”

— Brookings Institution, 2020

“Deepfakes have allowed people to claim that video evidence that would otherwise be very compelling is a fake.”

— Nick Dufour, Google research engineer

“The word deepfake has become a generic noun for the use of machine-learning algorithms and facial-mapping technology to digitally manipulate people’s voices, bodies and faces. And the technology is increasingly so realistic that the deepfakes are almost impossible to detect.”

— Ben Sasse

“The real problem is not whether machines think but whether men do.”

— B. F. Skinner

“Technological progress has merely provided us with more efficient means for going backwards.”

— Aldous Huxley

Morgan Freeman deepfake – the first 50 seconds of the Diep Nep video linked below are very impressive. From Creative Bloq:

“One of the most scarily convincing deepfakes is this Morgan Freeman deepfake. The video was first shared by Dutch deepfake YouTube channel Diep Nep last year, crediting the concept to Bob de Jong and the (very good) voice acting to Boet Schouwink.”

“The video’s still hugely impressive, and frightening, a year on, as we saw when it resurfaced on Twitter last month. ‘How can this tech NOT be deployed in the 2024 election?’ one user commented. ‘Soon we’ll see that even this is essentially child’s play when it comes to the actual, ever-present (yet invisible) capabilities of identity manipulation and whole-cloth digital identity creation…the implications of which are far-reaching & bone chilling,’ someone added.”

The real Morgan Freeman (I think).

Is your facial image stored somewhere on the web? Today, for virtually all of us, the answer is likely yes. This means that our facial images are available, quite possibly for bad purposes, to be used in creating deepfakes. Deepfakes are an application of AI that I did not know about until very recently, and they are something we all need to know about.

I have certainly run across the term “deepfake” a number of times, but it turns out that I had no real idea of what it actually was. Deepfakes, so I read, are even taking over TikTok – see below. And there is an estimate out that 20% of account takeover attacks this year will use deepfake technology.

Oh yes, and have you ever walked under any of the guesstimated 20 billion or so surveillance cameras that have been installed almost everywhere? These provide facial images from many angles, and probably plenty of video as well. Perhaps deepfakery is their main purpose? Or something worse, more likely.

What are “deepfakes”?

Wikipedia seems like a good place to start answering this basic question:

“Deepfakes (a portmanteau of ‘deep learning’ and ‘fake’) are synthetic media in which a person in an existing image or video is replaced with someone else’s likeness. While the act of creating fake content is not new, deepfakes leverage powerful techniques from machine learning and artificial intelligence to manipulate or generate visual and audio content that can more easily deceive. The main machine learning methods used to create deepfakes are based on deep learning and involve training generative neural network architectures, such as autoencoders, or generative adversarial networks (GANs) [see Related Reading below].

“Deepfakes have garnered widespread attention for their potential use in creating child sexual abuse material, celebrity pornographic videos, revenge porn, fake news, hoaxes, bullying, and financial fraud. This has elicited responses from both industry and government to detect and limit their use.”

“From traditional entertainment to gaming, deepfake technology has evolved to be increasingly convincing and available to the public, allowing the disruption of the entertainment and media industries.”

This sure sounds to me like something we all should really know about. So, here goes …

SentinelOne, a leading cybersecurity provider, gives an example of what it describes as deepfaking:

“Currently, this is more or less limited to ‘face swapping’: placing the head of one or more persons onto other people’s bodies and lip-syncing the desired audio. Nevertheless, the effects can be quite stunning, as seen in this Deepfake of Steve Buscemi faceswapped onto the body of Jennifer Lawrence.”

Steve deepfaking Jennifer is not too convincing to me, but check it out for yourself … https://youtu.be/r1jng79a5xc

How are “deepfakes” created?

For those who, like myself, are interested in the technical detail at a high level, here is what Wikipedia says:

“Techniques. Deepfakes rely on a type of neural network called an autoencoder. These consist of an encoder, which reduces an image to a lower dimensional latent space, and a decoder, which reconstructs the image from the latent representation. Deepfakes utilize this architecture by having a universal encoder which encodes a person into the latent space. The latent representation contains key features about their facial features and body posture. This can then be decoded with a model trained specifically for the target. This means the target’s detailed information will be superimposed on the underlying facial and body features of the original video, represented in the latent space.”
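To make that concrete, here is a minimal sketch in PyTorch of the shared-encoder, two-decoder arrangement Wikipedia describes. The layer sizes, the 64×64 resolution, and the bare-bones training step are all my own illustrative assumptions, not details of any actual deepfake tool:

```python
# Minimal sketch (not a production model): one shared encoder, one
# decoder per identity, as in the Wikipedia description above.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared ('universal') encoder: 64x64 image -> latent vector."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Identity-specific decoder: latent vector -> 64x64 image."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(self.fc(z).view(-1, 128, 8, 8))

encoder = Encoder()
decoder_a = Decoder()  # trained to reconstruct person A
decoder_b = Decoder()  # trained to reconstruct person B

# Training step (sketch): each person is reconstructed through the SHARED
# encoder but their OWN decoder, so the latent space learns pose and
# expression while each decoder learns one identity's appearance.
faces_a = torch.rand(8, 3, 64, 64)  # stand-in for real aligned face crops
faces_b = torch.rand(8, 3, 64, 64)
params = list(encoder.parameters()) + \
         list(decoder_a.parameters()) + list(decoder_b.parameters())
opt = torch.optim.Adam(params, lr=1e-4)
loss = nn.functional.mse_loss(decoder_a(encoder(faces_a)), faces_a) + \
       nn.functional.mse_loss(decoder_b(encoder(faces_b)), faces_b)
opt.zero_grad(); loss.backward(); opt.step()

# The actual "swap": encode a frame of person A, decode with B's decoder.
# B's face comes out wearing A's pose and expression.
with torch.no_grad():
    swapped = decoder_b(encoder(faces_a))
```

The swap in the last two lines is the whole trick: because the encoder is shared, the latent code carries pose and expression, while whichever decoder you choose supplies the identity.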

“A popular upgrade to this architecture attaches a generative adversarial network to the decoder. A GAN trains a generator, in this case the decoder, and a discriminator in an adversarial relationship. The generator creates new images from the latent representation of the source material, while the discriminator attempts to determine whether or not the image is generated. This causes the generator to create images that mimic reality extremely well as any defects would be caught by the discriminator. Both algorithms improve constantly in a zero sum game. This makes deepfakes difficult to combat as they are constantly evolving; any time a defect is determined, it can be corrected.”
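Continuing the toy model above, and again as a hedged sketch rather than any tool’s actual training code, “attaching a GAN to the decoder” might look like this (the discriminator architecture and the two-step loop are illustrative assumptions):

```python
# Attach a discriminator to decoder_b (from the previous sketch) so that
# decoded faces are pushed toward photo-realism.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Judges whether a 64x64 face crop is real or decoder-generated."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 1),  # raw logit: real vs. generated
        )

    def forward(self, x):
        return self.net(x)

disc = Discriminator()
bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder_b.parameters()), lr=1e-4)

real_b = torch.rand(8, 3, 64, 64)     # stand-in for real crops of person B
fake_b = decoder_b(encoder(faces_a))  # the generator's output (the swap)

# Step 1, discriminator: learn to score real crops 1 and generated crops 0.
d_loss = bce(disc(real_b), torch.ones(8, 1)) + \
         bce(disc(fake_b.detach()), torch.zeros(8, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Step 2, generator: adjust encoder/decoder so fakes score as real.
g_loss = bce(disc(fake_b), torch.ones(8, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Alternating these two steps is the "zero-sum game" described above:
# every defect the discriminator learns to spot becomes a training
# signal that teaches the generator to remove it.
```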

Chiradeep BasuMallick has a somewhat lengthy article on deepfakes in Spiceworks: “What Is Deepfake? Meaning, Types of Frauds, Examples, and Prevention Best Practices for 2022”:

“A deepfake is defined as an artificial intelligence-powered form of media that depicts a person saying something they did not say, appearing in a manner different from authentic visuals, or diverging from reality somehow, with the purpose of fooling the media viewer or a technology system.”

A highly-simplified picture of how image encoders, latent forms, and decoders are used, as illustrated above, to swap two images.

How to detect a deepfake according to the WEF

From the World Economic Forum’s website in April 2021 for its Global Technology Governance Summit: “How to tell reality from a deepfake?”

“… here are links to further reading from the World Economic Forum’s Strategic Intelligence platform:”

“A growing awareness of deepfakes meant people were quickly able to spot bogus online profiles of ‘Amazon employees’ bashing unions, according to this report – though a hyper-awareness of the technology could also lead people to stop believing in real media. (MIT Technology Review)”

“The systems designed to help us detect deepfakes can be deceived, according to a recently-published study – by inserting ‘adversarial examples’ into every video frame and tripping up machine learning models. (Science Daily)”

“Authoritarian regimes can exploit cries of ‘deepfake.’ According to this opinion piece, claims of deepfakery and video manipulation are increasingly being used by the powerful to claim plausible deniability when incriminating footage surfaces. (Wired)”

“It’s easy to blame deepfakes for the proliferation of misinformation, but according to this opinion piece the technology is no more effective than more traditional means of lying creatively – like simply slapping a made-up quote onto someone’s image and sharing it. (NiemanLab)”

“A recently-published study found that one in three Singaporeans aware of deepfakes believe they’ve circulated deepfake content on social media, which they later learned was part of a hoax. (Science Daily)”

Don’t know about you, but these links didn’t really help me much in figuring out whether I might be watching a real person or a deepfake person. It turns out that there is a tech solution, probably one of many out there, that works with a browser:

An example of a browser-based deepfake detector

Via GitHub: “AI DeepFake detection”:

“This is a browser extension to detect deepfakes. A new button is integrated in the YouTube™ player for this purpose.”

“Introduction. Creating deepfake videos is getting easier and easier. You don’t need technical skills anymore to make a manipulated video. You just follow an online instruction, that’s enough. At the same time, the videos are becoming more and more realistic.”

“Humans are far inferior to AI when it comes to recognizing deepfake videos. Therefore, it is now possible to manipulate entire societies, since the naked eye can no longer recognize deepfakes. The question for everyone is how to protect themselves and others from this disinformation.”

“Part of the solution is to critically question and check if what you see is plausible. In addition, however, we need technical tools and have to fight AI with AI.”

Makes sense to me. Not sure however that I want to do a deepfake check like this for every facial image and video clip that I run across. It may be simpler and safer to assume that everything is a deepfake until proven otherwise, which you would do only for something quite important. Like suspected phishing, maybe …

Just what we need – deepfakes for better phishing

Stu Sjouwerman writing in Fast Company in December 2022: “Deepfakes: Get ready for phishing 2.0”:

“Phishing has been around for at least a few decades. Attackers leverage human psychology, exploit human nature (such as impulsiveness, grievances, and curiosity), and impersonate trusted entities to fool victims into carrying out an action such as clicking a malicious URL, downloading an attachment, transferring funds, or sharing sensitive data.”

“While the most common form of phishing is still via email, we’re seeing an uptick in phishing that combines the use of voice (vishing), social media, and SMS (aka smishing) to appear more believable and trustworthy to victims. With deepfakes, phishing is evolving once again and being called the most dangerous form of cybercrime [emphasis added].”

“What Are Deepfakes? Deepfake technology (or deepfakes) is ‘a kind of artificial intelligence (AI) capable of generating synthetic audio, video, images, and virtual personas,’ says Steve Durbin of the Information Security Forum. Users might already recognize this on mobile phones, with apps that can seemingly bring dead people back to life, swap faces with celebrities, and create hyper-realistic effects like de-aging Hollywood actors.”

“By 2023, 20% of all account takeover attacks will leverage deepfake technology. It’s time organizations recognize this threat and raise employee awareness because synthetic media is here to stay and will certainly become more realistic and widespread.”

Twitter – November 21, 2022:
Source: https://coingape.com/ftx-news-sbfs-viral-deepfake-lures-customers-for-refund/

Deepfakes are even taking over TikTok and much else

From The Conversation, a news and commentary website, we learn: “Deepfakes are taking over TikTok — here’s how you can spot them”:

“Although deepfakes are often used creatively or for fun, they’re increasingly being deployed in disinformation campaigns, for identity fraud, and to discredit public figures and celebrities.”

“One of the world’s most popular social media platforms, TikTok, is now host to a steady stream of deepfake videos. Deepfakes are videos in which a subject’s face or body has been digitally altered to make them look like someone else – usually a famous person. One notable example is the @deeptomcruise TikTok account, which has posted dozens of deepfake videos impersonating Tom Cruise and attracted some 3.6 million followers.”

“But they’re also available for misuse. At the same time, deepfake technology is thought to present several social problems such as:

  • Deepfakes are being used as ‘proof’ for other fake news and disinformation.
  • Deepfakes are being used to discredit celebrities and others whose livelihood depends on sharing content while maintaining a reputation.
  • Difficulties providing verifiable footage for political communication, health messaging, and electoral campaigns.
  • People’s faces are being used in deepfake pornography.”

Identity spoofing seems like a very serious problem

Whenever an individual can identify themselves through an online interface, there is an opportunity for identity spoofing. This might occur, for example, in an online job interview, or when a video caller asks for access to a building or facility using a spoofed image of a manager.

We have become so accustomed to interacting openly with the talking heads that show up on our screens that it would almost never occur to anyone that a talking head might be a deepfake. Such spoofing is most likely in a one-on-one situation rather than in a multi-person Zoom-type interaction.

Jack Cook had an article in an American University publication in July 2022 that illustrates the identity spoofing dangers that exist today: “Deepfake Technology: Assessing Security Risk”:

“Imagine scrolling through your favorite social media feed when something catches your eye—a short video clip of a familiar face. Businessman turned celebrity Elon Musk is promoting a new cryptocurrency investment. All you need to do is transfer funds to a crypto wallet and the returns will be guaranteed. After all, you’ve heard stories from friends who have made money from Musk’s other endorsements.”

“This situation occurred recently, and a small number of investors jumped at the opportunity after seeing the interview clip of Elon Musk. Unfortunately for them, the video was not real, it was a deepfake. Deepfakes, fabricated videos which imitate the likeness of an individual, can take on many forms. Often, these include creating an image of a person that does not exist, creating a video of someone saying or doing something they have never done, or synthesizing a person’s voice in an audio file. Although deepfake technology is relatively primitive, bad actors have increasingly used it for malicious purposes. As the technology progresses, people will likely continue to use it for reputation tarnishing, financial gain, and for harming state security. Additionally, academics and policymakers show varying levels of concern for how deepfakes could harm society. It has yet to be seen if social media giants and governments will holistically address the misuse of deepfake technology, however some efforts are underway.”

“… Furthermore, cyber criminals use deepfake technology to conduct online fraud. For example, a recent scheme utilized artificially generated audio to match an energy company CEO’s voice. When the fake ‘CEO’ called an employee to wire money, his slight German accent and voice cadence matched perfectly. The employee wired $243,000 to the cybercriminal before realizing his mistake.”

How can we normal folks spot a deepfaked video call?

Liam Tung, ZDNet Contributing Writer, suggested a simple trick in August 2022: “How to spot a deepfake? One simple trick is all you need”:

“With criminals beginning to use deepfake video technology to spoof an identity in live online job interviews, security researchers have highlighted one simple way to spot a deepfake: just ask the person to turn their face sideways on.”

“The reason for this as a potential handy authentication check is that deepfake AI models, while good at recreating front-on views of a person’s face, aren’t good at doing side-on or profile views like the ones you might see in a mug shot.”

“Metaphysic.ai highlights the instability of recreating full 90° profile views in live deepfake videos, making the side profile check a simple and effective authentication procedure for companies conducting video-based online job interviews.”

“… the Federal Bureau of Investigation warned it had seen an uptick in scammers using deepfake audio and video when participating in online job interviews, which became more widely used in the pandemic. The FBI noted that tech vacancies were targeted by deepfake candidates because the roles would give the attacker access to corporate IT databases, private customer data, and proprietary information.”
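For what it’s worth, here is a toy sketch of how the “turn sideways” step of such an interview might be semi-automated. To be clear, this is my own illustration, not Metaphysic.ai’s or anyone else’s method: it just uses OpenCV’s stock Haar cascades to confirm that a genuine profile view was captured, so a human can then scrutinize those frames for the warping a deepfake model tends to produce at 90°. The frame count and detector parameters are arbitrary guesses:

```python
# Toy "turn your head" check for a live video call, using OpenCV's
# bundled Haar cascades. This only verifies that profile-view frames
# were captured; judging them for deepfake artifacts is left to a human.
import cv2

frontal = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_profileface.xml")

def view_of(frame):
    """Classify a frame as 'frontal', 'profile', or 'neither'."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if len(frontal.detectMultiScale(gray, 1.1, 5)) > 0:
        return "frontal"
    # The stock cascade only finds left-facing profiles, so also try
    # the mirrored frame to catch right-facing ones.
    if len(profile.detectMultiScale(gray, 1.1, 5)) > 0 or \
       len(profile.detectMultiScale(cv2.flip(gray, 1), 1.1, 5)) > 0:
        return "profile"
    return "neither"

cap = cv2.VideoCapture(0)  # live webcam feed
print("Please turn your head fully to one side...")
profile_frames = []
for _ in range(150):       # roughly 5 seconds at 30 fps
    ok, frame = cap.read()
    if not ok:
        break
    if view_of(frame) == "profile":
        profile_frames.append(frame)  # saved for close human inspection
cap.release()

print(f"Captured {len(profile_frames)} profile frames; review them for "
      "the smearing and warping deepfake models tend to show side-on.")
```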

What if AI-driven callers are already wise to this trick? Maybe something like “FakeCatcher” by Intel might work …

Jim Nash reporting in BiometricUpdate.com in November 2022: “A big name enters the battle against deepfake threat”:

“Intel says its new product, FakeCatcher, detects 96 percent of deepfakes. Executives say theirs is the first detector that works in milliseconds. Ilke Demir, a senior staff research scientist in Intel Labs, designed the tool with Umur Ciftci, a research scientist at the State University of New York at Binghamton. It is server-based hardware and software that, it is hoped, can forestall the money that businesses, public figures and governments are expected to have to pay to deal with deepfakes.”

“Intel says the tool is trained to watch for clues of authenticity in real videos. Those clues include the flush of color that washes over human skin with each heartbeat. Imperceptible for the eye, it is easy for an algorithm to see in an ordinary camera. Spatiotemporal maps are made using the phenomenon, and deep learning code makes the call. Intel also applies AI to analyzing facial expressions to help teachers know how well they are getting through to students.”
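Intel has not published FakeCatcher’s pipeline, but the underlying idea, remote photoplethysmography, can be sketched very simply: average the green channel over a presumed-skin patch in each frame and ask whether the resulting signal has a dominant frequency in a plausible heart-rate band. Everything below (the region of interest, the band limits, the score itself) is an illustrative assumption, not Intel’s algorithm:

```python
# Heavily simplified sketch of heartbeat-based ("rPPG") deepfake
# screening: real skin flushes faintly with each pulse; synthesized
# skin usually lacks a clean periodic signal in the heart-rate band.
import numpy as np

def heartbeat_score(frames, fps=30.0):
    """frames: sequence of HxWx3 uint8 face crops from one video."""
    # Mean green intensity of a central (presumed skin) patch per frame.
    signal = np.array([
        f[f.shape[0] // 3: 2 * f.shape[0] // 3,
          f.shape[1] // 3: 2 * f.shape[1] // 3, 1].mean()
        for f in frames
    ])
    signal = signal - signal.mean()  # drop the DC component
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)  # ~42 to 240 beats per minute
    # Fraction of spectral power in the heart-rate band: a live face
    # should concentrate its pulse energy there.
    return spectrum[band].sum() / (spectrum[1:].sum() + 1e-9)

# Synthetic demo: frames with a 1.2 Hz (72 bpm) "pulse" vs. frames with
# only noise where the pulse should be.
t = np.arange(300) / 30.0
rng = np.random.default_rng(0)

def synth_frames(pulse_strength):
    brightness = 120 + 2.0 * pulse_strength * np.sin(2 * np.pi * 1.2 * t)
    return [np.full((60, 60, 3), b + rng.normal(0, 0.5),
                    dtype=np.float32).astype(np.uint8)
            for b in brightness]

print("pulsing face:", round(heartbeat_score(synth_frames(1.0)), 3))
print("flat face:   ", round(heartbeat_score(synth_frames(0.0)), 3))
```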

Yuri Svitlyk, writing in Root Nation in August 2022, has a quite helpful article: “What is a deepfake, why is it dangerous and how to recognize it”:

“So how do you recognize a deepfake? Here are some things to keep in mind while watching the video:”

  • “Does the sound keep up with the movements of the mouth? Sometimes they don’t match exactly, and the person in the video moves their lips too late.”
  • “All kinds of phenomena that seem unnatural. We are talking here, among other things, about the position of the whole body or head in relation to the torso, incorrect reflection of light on objects, incorrect reflection of light in jewelry, etc. An unnatural skin color can also be a sign.”
  • “Audio and video quality. The difference between them will help to detect deepfake. Usually, the audio track has the worst quality.”
  • “Image irregularities. More often they appear at the junction of the body and head. When a celebrity’s head is ‘glued’ to another body, blurring can appear in the neck area.”
  • “Sometimes there are frame gaps and errors (different angles of light, type, or direction).”

“You should also rely on your own feelings. Sometimes we get the impression that something is ‘wrong’. This happens, among other things, when the emotions of the person depicted on the screen do not match those shown by facial expression or tone of voice. This also suggests that the video may have been tampered with.”
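The first check on that list, whether the sound keeps up with the mouth, can even be roughed out in code. The sketch below assumes you have already extracted two per-frame signals by some means, say mouth openness from a facial-landmark detector and loudness from the audio track; cross-correlation then estimates the audio/video offset. The two-frame alarm threshold is an arbitrary guess:

```python
# Rough lip-sync check: cross-correlate a per-frame mouth-openness
# signal with the audio loudness envelope (resampled to the frame rate)
# and report the offset at which they line up best.
import numpy as np

def av_offset_frames(mouth_openness, audio_envelope):
    """Estimated frames by which audio lags (+) or leads (-) the lips."""
    m = (mouth_openness - mouth_openness.mean()) / (mouth_openness.std() + 1e-9)
    a = (audio_envelope - audio_envelope.mean()) / (audio_envelope.std() + 1e-9)
    xcorr = np.correlate(a, m, mode="full")
    return int(np.argmax(xcorr)) - (len(m) - 1)

# Synthetic demo: the audio envelope is the mouth signal delayed 4 frames.
rng = np.random.default_rng(1)
mouth = rng.random(300)
audio = np.roll(mouth, 4) + rng.normal(0, 0.05, 300)

lag = av_offset_frames(mouth, audio)
print(f"estimated offset: {lag} frames")  # expect about 4
if abs(lag) > 2:  # more than ~65 ms at 30 fps
    print("lip-sync offset looks suspicious - inspect the video closely")
```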

Bottom line:

Deepfakes have become an extremely serious problem in a wide variety of applications, particularly those involving criminal activity. Continued technological advances in AI generally, and machine learning in particular, have made reliable detection of deepfakes increasingly difficult. Tool-based detection schemes can be learned and overcome by deepfake intelligence, making it necessary in many cases to rely on human-based behavioral detection, together with much greater user caution when faced with images and videos requesting access to important or valuable information. This suggests that the ultimate defense against deepfakery will be regular and intensive user training, perhaps backed by evolving detection tools.

Related Reading

Generative Adversarial Networks (GANs).
“A key technology leveraged to produce deepfakes and other synthetic media is the concept of a ‘Generative Adversarial Network’ or GAN. In a GAN, two machine learning networks are utilized to develop synthetic content through an adversarial process. The first network is the ‘generator.’ Data that represents the type of content to be created is fed to this first network so that it can ‘learn’ the characteristics of that type of data. The generator then attempts to create new examples of that data which exhibit the same characteristics of the original data. These generated examples are then presented to the second machine learning network, which has also been trained (but through a slightly different approach) to ‘learn’ to identify the characteristics of that type of data.”

“This second network (the ‘adversary’) attempts to detect flaws in the presented examples and rejects those which it determines do not exhibit the same sort of characteristics as the original data – identifying them as ‘fakes.’ These fakes are then ‘returned’ to the first network, so it can learn to improve its process of creating new data.”

“This back and forth continues until the generator produces fake content that the adversary identifies as real. The first practical application of GANs was established by Ian Goodfellow and his coworkers in 2014, when they demonstrated the ability to create synthetic images of human faces. While human faces are a popular subject of GANs, they can be applied to any content. The more detailed (i.e., realistic) the content used to train the networks in a GAN, the more realistic the output will be.”

“In August, Patrick Hillman, chief communications officer of blockchain ecosystem Binance, knew something was off when he was scrolling through his full inbox and found six messages from clients about recent video calls with investors in which he had allegedly participated. ‘Thanks for the investment opportunity,’ one of them said. ‘I have some concerns about your investment advice,’ another wrote. Others complained the video quality wasn’t very good, and one even asked outright: ‘Can you confirm the Zoom call we had on Thursday was you?’”

“With a sinking feeling in his stomach, Hillman realized that someone had deepfaked his image and voice well enough to hold 20-minute ‘investment’ Zoom calls trying to convince his company’s clients to turn over their Bitcoin for scammy investments. ‘The clients I was able to connect with shared with me links to faked LinkedIn and Telegram profiles claiming to be me inviting them to various meetings to talk about different listing opportunities. Then the criminals used a convincing-looking holograph of me in Zoom calls to try and scam several representatives of legitimate cryptocurrency projects,’ he says.”

“As the world’s largest crypto exchange with $25 billion in volume at the time of this writing, Binance deals with its share of fake investment frauds that try to capitalize on its brand and steal people’s crypto.”

“The scam is so novel that if it weren’t for astute investors detecting oddities and latency in the videos, Hillman may never have known about these deepfake video calls, despite the company’s heavy investments in security talent and technologies.”

“The backlash was swift. But what’s relatively overlooked is the vast potential to use artistic generative AI in scams. At the far end of the spectrum, there are reports of these tools being able to fake fingerprints and facial scans (the method most of us use to lock our phones).”

“Criminals are quickly finding new ways to use generative AI to improve the frauds they already perpetrate. The lure of generative AI in scams comes from its ability to find patterns in large amounts of data.”

“Cybersecurity has seen a rise in ‘bad bots’: malicious automated programs that mimic human behaviour to conduct crime. Generative AI will make these even more sophisticated and difficult to detect.”

“Ever received a scam text from the ‘tax office’ claiming you had a refund waiting? Or maybe you got a call claiming a warrant was out for your arrest?”

“In such scams, generative AI could be used to improve the quality of the texts or emails, making them much more believable. For example, in recent years we’ve seen AI systems being used to impersonate important figures in ‘voice spoofing’ attacks.”

“Then there are romance scams, where criminals pose as romantic interests and ask their targets for money to help them out of financial distress. These scams are already widespread and often lucrative. Training AI on actual messages between intimate partners could help create a scam chatbot that’s indistinguishable from a human.”

“Generative AI could also allow cybercriminals to more selectively target vulnerable people. For instance, training a system on information stolen from major companies, such as in the Optus or Medibank hacks last year, could help criminals target elderly people, people with disabilities, or people in financial hardship.”

“Generative artificial intelligence (AI) describes algorithms (such as ChatGPT) that can be used to create new content, including audio, code, images, text, simulations, and videos. Recent new breakthroughs in the field have the potential to drastically change the way we approach content creation.”

“While many have reacted to ChatGPT (and AI and machine learning more broadly) with fear, machine learning clearly has the potential for good. In the years since its wide deployment, machine learning has demonstrated impact in a number of industries, accomplishing things like medical imaging analysis and high-resolution weather forecasts. A 2022 McKinsey survey shows that AI adoption has more than doubled over the past five years, and investment in AI is increasing apace. It’s clear that generative AI tools like ChatGPT and DALL-E (a tool for AI-generated art) have the potential to change how a range of jobs are performed. The full scope of that impact, though, is still unknown—as are the risks. But there are some questions we can answer—like how generative AI models are built, what kinds of problems they are best suited to solve, and how they fit into the broader category of machine learning.”

  • Deepfake detectors may not be the answer: “A deepfake detection challenge run by Facebook found the best system could only detect a synthetic video 65 per cent of the time. (Facebook)”
Facebook’s deepfake detection challenge.