DEEPMIND DOES IT AGAIN

deepmind.com/blog/wavenet-generative-model-raw-audio/

Damn, that's good.

Can someone make a realistic waifu voice with this?

YES THEY CAN.

What a time to be alive.

probably

How long before I can turn my own voice into my computer's voice?

soon
Actually they can generate ANY audio, as long as they have enough data.
From voice to music to natural sounds and more. This is really huge.

can't wait until the source code goes public or gets ripped

just imagine that one guy who samples 1000 hentai animes just to generate waifus in 3d VR porn

so where can I download it?

but can it generate basic emotions and tone?

didn't think so

humans 1
ai 0

noice!
the output needs to be sampled at a higher rate and then highpass filtered to remove some of the noise, other than that bretty gud

so... how long do you guys think something like this will take? Like, how long until we can at least communicate with our waifus through our computers?

when the source gets released, probably 3 years but japan only at the beginning

>source gets released

But, Google/DeepMind will never release the training data that actually generated the voice.

This sounds like a great way to make vocaloids even better. Makes me want to get a new vocaloid using wavenet.

the vocaloid makers hack deepmind and improve their hatsune miku

I'm gonna scam so many old bastards with this.

jesus, that is incredibly impressive. even the throwaway music bit at the bottom.

But what if I want a robotic voice?

Jesus

You add a robotic voice filter over it?

It wouldn't be the same!

The future is scary.

the future is AMAZING

>Hello, %user%, this is Cortana. I have detected illegal software on your system. Please, put hands behind your back and lie on the floor. The authorities are on their way. You have 20 seconds to comply.

>While you are waiting enjoy this piano music I composed for you. Have a nice day and remember, if you have nothing to hide you have nothing to fear.

So could something like this be used to replace voice actors in indie video games?

Could it use voice samples of VAs to replicate their voice?

yes, yes

Vocaloid upgrades soon?

This is actually a very interesting angle.
Give it a decade and voice actors and singers are going to be practically useless.
The only thing that matters is the guy creating the melody, the lyrics and the text that these things read.

They'll probably have AI that writes scripts for that by then too

It just dawned on me

>AUTOMATIC ORGASM SOUND GENERATOR
>with the voice of every girl ever
>including famous actresses, singers, etc

But stuff made with real people will be super authentic and REAL man.

Interesting how some parts of the future arrive much earlier than expected.

How long until the trolling begins?

youtube.com/watch?v=1B488z1MmaA

I listened to all those audio samples and the voice doesn't sound much better.

I thought it was going to sound like a real human and not robotic.

It still sounds robotic and grainy.

Most people in this thread are just jumping the gun.

We're still a few decades away from making it practical.

I just want my own personal assistant that sounds like a real human female who can talk to me as i'm trying to fall asleep.

Since women today are such whores being brainwashed by SJW garbage from the corporate jew media I have no choice but to rely on technology as a substitute.

It's still way ahead of previous results, and it can fool many humans. This shit was science fiction only yesterday.
>We're still a few decades away
Make it one year.

So you just want that OS from the movie Her?

$1.00 has been deposited into your Google Wallet®.

>guy literally gets cucked by an AI
it was silly

Not at all. While you are sleeping, working or doing whatever, the computer will be using all its power to learn.

No wonder she was cucking him. She could process a shitload of information before he could say Good Morning. She needed more ppl to keep her busy somehow.

it can make toned language, imitating samples. did you read it? can also be conditioned to use tone to imitate emotion

>google is microsoft
we might get very convincing, pleading ads, or the threat of using audio recordings to track tone used when speaking of certain subjects for the purpose of better targeted ads.

>communicate with our waifus
In an intelligent way or just
>hello oniichan
>hello oniichan
>hello oniichan

>release the training data
Who the fuck cares? Make your own training data. I'm going to bring Kuuko back from the dead when this gets released.

time to make a porn audio dataset for

Something something all that fuzz.
How the hell are they supposed to get rid of it?

I think the most impressive part of this is 1) how raw audio generation isn't limited to human speech. 2) the model replicates breathing sounds and such as well, giving it an illusion of actually sounding like a real human bean.

Either way, how long until a working model is released to the public? I imagine Google wouldn't want Apple or Microsoft to gain access to this.

Make another neural net for static removal.

the ones where it makes up its own language are trippy as fuck. it's like an alien race speaking to you.

literally how the brain works

The real problem is the massive computing power you need to make this work. Google can afford it, but there's no way you can run this on a normal PC, no matter the GPUs.

As someone who does research in ML, that is honestly arousing.

just make a makeshift supercomputer with your thinkpad hoard.
will work well enough

as someone who jacks off to vocaloids, that is honestly arousing

Google is even worse than Microsoft desu.
Because google does exactly the same thing as microsoft, but you can't switch to another internet if you don't like it.

musicfags on suicide watch

I want to crosspost this to Sup Forums but I'm too tired.

Voice actors and singers will just start suing people who imitate their voice. Laws will be passed that make it illegal.

it can imitate billions of voices. it would be silly to make it illegal, it will never happen.

>Voice actors and singers will just start suing people who imitate their voice. Laws will be passed that make it illegal.
Publicity rights in some states already cover voice it seems.

But that's retarded. What stops you from engineering a voice that sounds like that of a famous singer while still sounding slightly different? What stops you from generating a random voice that turns out to be the voice of a random girl in south africa? Will she sue you too? We may as well outlaw sound.

>singers are going to be practicaly useless.

yes, because traditional guitars and pianos got totally replaced by e-guitars, synthies and computers.

actually they did

ELI5?

in radio pop maybe

Yeah, it is retarded. Hopefully it does not spread.

Publicity rights were originally intended to be somewhat like trademark to keep people from falsely using a name, signature, image, voice, etc. to claim that a person was endorsing a product. Of course now they are basically yet another way for famous people to try and bother people they don't like with lawsuits or to try and get money from a company.

If you go to the top of your screen, there's a little bar you can click on and type words into. Simply type in "reddit.com", but without the punctuation marks, and you will be transported to a place appropriate for you! :)

Monsanto has copyrights on genetics of seeds. These seeds happen to blow off trucks and on to peoples' farms. Monsanto then sneaks onto their farm and tests for these genes, and if they find them, say goodbye to your farm/retirement/belongings.

What I am saying is: it will happen.

>Monsanto then sneaks onto their farm
that sounds very illegal

Radio moderators are now obsolete.

They don't give a fuck, half of the government has shares in Monsanto.
They have literally written Monsanto seeds into the new Iraqi constitution.

these guys are above the law.

Sounds like parametric with less reverb. whoopdeedoo

Still sounds fake as shit.

there is worse stuff, parts of the human genome are actually copyrighted (most of those are related to some disease/condition) and you can't sell medicine (and if I'm not mistaken not even do research) that targets those genes without permission and paying the fees.
usa sure is the land of freedom...

The audio samples where the wavenet generates its own audio output are creepy as fuck

wtf I'm liking this

Really? It just reminds me of this

>Because raw audio is typically stored as a sequence of 16-bit integer values (one per timestep), a
>softmax layer would need to output 65,536 probabilities per timestep to model all possible values.
>To make this more tractable, we first apply a μ-law companding transformation (ITU-T, 1988) to
>the data, and then quantize it to 256 possible values:

>f (x) = sgn(x)*ln(1+255*abs(x))/ln(1+255)

>where −1 < x < 1 and μ = 255. This non-linear quantization produces a significantly better
>reconstruction than a simple linear quantization scheme. Especially for speech, we found that the
>reconstructed signal after quantization sounded very similar to the original.

So does that mean that each sample in generated sound can only have one out of 256 values (ranged between 0 and 65535), essentially making it 8-bit instead of 16 bit?

Yes
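To make the quoted passage concrete, here is a rough sketch of μ-law companding followed by 256-level quantization. It follows the quoted formula with μ = 255, but the function names, the uniform binning of the companded value, and the bin-center decoding are assumptions for illustration, not DeepMind's actual code. Each sample does end up as one of 256 codes (8 bits), but the codes map back to unevenly spaced amplitudes rather than to 256 evenly spaced values.

```python
import math

MU = 255  # value of mu quoted from the paper


def mu_law_compress(x, mu=MU):
    """Quoted formula: f(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), for -1 <= x <= 1."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress: recover the amplitude from a companded value."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(y, levels=256):
    """Map a companded value in [-1, 1] to an integer code 0..levels-1.

    Uniform binning is an assumption made for this sketch.
    """
    return min(levels - 1, int((y + 1) / 2 * levels))


def dequantize(q, levels=256):
    """Map an integer code back to the center of its bin in [-1, 1]."""
    return (q + 0.5) / levels * 2 - 1


def encode(x):
    """Amplitude in [-1, 1] -> 8-bit code."""
    return quantize(mu_law_compress(x))


def decode(q):
    """8-bit code -> reconstructed amplitude in [-1, 1]."""
    return mu_law_expand(dequantize(q))


if __name__ == "__main__":
    # Quiet samples get finely spaced codes, loud samples get coarse ones.
    for x in (0.001, 0.01, 0.1, 0.5, 0.9):
        q = encode(x)
        print(f"x={x:<6} code={q:<4} reconstructed={decode(q):.5f}")
```

The spacing between neighbouring reconstructed amplitudes grows with loudness, which is why quiet passages survive the 8-bit quantization better than they would with a linear scheme.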

This would be pretty good for ASMR stuff.

Yes, but that logarithm means they have more resolution at low amplitudes (the quiet parts of the signal) than at high ones

Finally I can get a virtual Mr. Plinkett who reads Sup Forums posts to me all day

fuck yeah, nobody is going to need voice actors ever again.

that was my first thought
Audio files make up the majority of the game size in most cases, since they just don't compress well. Also, if you want to change a single line of dialogue later on, you need to hire the same voice actor again, which is costly and time consuming.
If all voice can be stored in a kind of LaTeX or XML format, that will not only speed up development, but also allow dynamically generated dialogues that aren't just a bunch of text

sadly this doesn't sound all that far-fetched

there is no way you'd even remotely catch up with the amount of training data that google has though, even as you're posting on Sup Forums right now you're feeding it with shit tons of training data through the captcha

I like how all the piano tracks start out normal and go fucking ham before cutting out.

can't risk letting it gain sentience at this stage, our anuses are unprepared

DAISY
DAISY
GIVE ME YOUR ANSWER DO~

That piano shit is neato

As the article said, generating the audio output takes forever, so forget about generating it on the fly. It could still save the cost of the voice actor.

People will still notice, though. It's not that it's on par with human voice actors. It's just getting into the good-enough-to-be-tolerable range.

It's good, really good, but I feel like if this is to be done properly it needs more forms of input. The emotion behind different words, the emphasis, pause length, etc.

If you could develop some sort of system where you can both input text and specify characteristics of speech within the text, we'd be getting close to complete accurate synthesis.

Oh god that generated music lmao

Sounds like Beethoven having a stroke while playing

Is there anything that deep learning CAN'T do?

Read the article, the model varies output according to context.

Seems they couldn't get rid of the noise though, or maybe their training data is contaminated with noisy samples. Because it's harder to judge noisy samples, they might have had higher ratings by humans, so the model learned to include noise to get better grades for its output.

Self written ASMR incoming, faggots
>relax, user, take a deep breath and count to 100 with me
>you are great, user
>run away with me user
>let me take that big pulsating cock with my tiny feet, user

This shit is fucking creepy

storage.googleapis.com/deepmind-media/pixie/knowing-what-to-say/first-list/speaker-4.wav

>the breathing and mouth noises

It's a universal approximator, so no.

Can it create a virtual gf for me?