<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[CleverPeople.com: All Site Blogs: November 2017]]></title>
	<link>https://cleverpeople.com/blog/all/1509508800/1512104400</link>
	<atom:link href="https://cleverpeople.com/blog/all/1509508800/1512104400" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
		<item>
	<guid isPermaLink="true">https://cleverpeople.com/blog/view/457/using-ai-to-automate-dialogue-animation-of-3d-mesh-character-models</guid>
	<pubDate>Tue, 07 Nov 2017 13:41:31 -0500</pubDate>
	<link>https://cleverpeople.com/blog/view/457/using-ai-to-automate-dialogue-animation-of-3d-mesh-character-models</link>
	<title><![CDATA[Using AI to Automate Dialogue Animation of 3D Mesh Character Models]]></title>
	<description><![CDATA[<p>I believe I&#39;ve developed a process to use <strong>Artificial Intelligence (AI)</strong> to automate the dialogue animation of 3D mesh character models. Let me start with the vision: I want to...</p><ol><li>Record an audio track of character dialogue.</li>
	<li>Analyse the audio track using speech-to-text artificial intelligence.</li>
	<li>Receive the speech-to-text results, including time-offset information for each word and phoneme.</li>
	<li>Import those timed results into a Blender 3D animation timeline.</li>
	<li>Blender matches each phoneme to its position on the timeline.</li>
	<li>A mouth shape from the character&#39;s pose library is selected based on the phoneme and its timing.</li>
</ol><p>While this sounds like a dream (because it would be), I actually think the pieces for this are already out there. With the Google Cloud Speech API, I can post my audio file to the AI and get reliable speech-to-text conversion with word confidence scores. If, in our Python script, we set:</p><p><code>enable_word_time_offsets=True</code></p><p>we get the text results with time offsets for each word. I&#39;m going to check with Google, but I bet there is a debug flag available to get time offsets for each individual phoneme as well. Why can&#39;t we use that data to re-associate the words with our timeline in Blender?</p><p>Working from the other end of the pipeline, I see a product called Papagayo that maps text to mouth shapes, and a Blender add-on called Automatic Lipsync that imports the Papagayo data into Blender.</p><p>Mission: Don&#39;t we now have the technology to put ALL of these together, either as an add-on or, better yet, in Blender&#39;s core?</p><p>I&#39;m fairly new to Blender, so this is going to be above my skill level - but folks, although ambitious, I see no reason why this isn&#39;t possible. 
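</p><p>To make the timeline step concrete, here is a minimal Python sketch of how per-word time offsets from the speech results could be converted into Blender frame numbers. The function names and the frames-per-second value are my own hypothetical choices, not part of any existing add-on:</p>

```python
# Hypothetical helpers for the timeline step: mapping Speech API word
# time offsets onto Blender frame numbers. All names here are my own.

def seconds_to_frame(seconds, fps=24):
    """Convert a time offset in seconds to the nearest Blender frame number."""
    return int(round(seconds * fps)) + 1  # Blender scenes start at frame 1 by default

def words_to_keyframes(word_offsets, fps=24):
    """Take (word, start_sec, end_sec) tuples, as parsed from the API response,
    and return (word, start_frame, end_frame) ready for keyframing."""
    return [(word, seconds_to_frame(start, fps), seconds_to_frame(end, fps))
            for word, start, end in word_offsets]

# Example: two words recognized with time offsets
timed_words = [("hello", 0.0, 0.4), ("world", 0.5, 1.0)]
print(words_to_keyframes(timed_words, fps=24))
# → [('hello', 1, 11), ('world', 13, 25)]
```

<p>Once the timings are expressed as frames, a script could step through the character&#39;s pose library and keyframe the matching mouth shape at each word&#39;s start frame.</p><p>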
Where do I begin to make this happen, and what would be the most appropriate forum to further the discussion?</p><p>Resources:</p><p>A primer on <a href="https://cloudplatform.googleblog.com/2017/11/demystifying-ML-how-machine-learning-is-used-for-speech-recognition2.html">using Google Cloud Speech API and how speech recognition works</a>.</p><p>A page with <a href="https://cloud.google.com/speech/docs/async-time-offsets#speech-async-recognize-gcs-python" target="_blank">instructions and example Python code</a> for processing audio with time offsets.</p><p>The official page for <a href="http://lostmarble.com/papagayo/" target="_blank">Papagayo</a>.</p><p>The official page for the <a href="https://morevnaproject.org/2015/11/11/automatic-lipsync-animation-in-blender/" target="_blank">Lip Sync Add-on</a>.</p><p>And of course, <a href="https://www.blender.org/" target="_blank">Blender 3D</a>!</p>]]></description>
	<dc:creator>Gary Wright II</dc:creator>		</item>
</channel>
</rss>
