DEEPMIND DOES IT AGAIN

Cameron Jackson

deepmind.com/blog/wavenet-generative-model-raw-audio/

September 8, 2016 - 07:57

Noah Nelson

Damn, that's good.

September 8, 2016 - 07:59

Henry Parker

Can someone make a realistic waifu voice with this?

September 8, 2016 - 08:02

Camden Kelly

YES THEY CAN.

What a time to be alive.

September 8, 2016 - 08:03

Lucas Rogers

probably

September 8, 2016 - 08:03

Sebastian King

How long before I can turn my own voice into my computer's voice?

September 8, 2016 - 08:08

Adrian Hill

soon
Actually they can generate ANY audio, as long as they have enough data.
From voice to music to natural sounds and more. This is really huge.

September 8, 2016 - 08:12

Julian Ross

can't wait until the source code goes public or gets ripped

just imagine that one guy who samples 1000 hentai animes just to generate waifus in 3d VR porn

September 8, 2016 - 08:14

Christopher Powell

so where can I download it?

September 8, 2016 - 08:18

Angel Green

but can it generate basic emotions and tone.

didn't think so

humans 1
ai 0

September 8, 2016 - 08:20

Christopher Cruz

noice!
the output needs to be sampled at a higher rate and then highpass filtered to remove some of the noise, other than that bretty gud
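
For anyone wondering what "highpass filtered" means in practice, here is a minimal sketch of a one-pole high-pass filter in plain Python. The 16 kHz sample rate and 100 Hz cutoff are illustration values only, not taken from the article, and whether this would actually clean up the WaveNet samples is just the guess in the post above, not something tested here.

```python
import math


def one_pole_highpass(samples, sample_rate_hz, cutoff_hz):
    """First-order (RC-style) high-pass filter over a list of floats."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant for the cutoff
    dt = 1.0 / sample_rate_hz               # time between samples
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Keep changes between samples, let slow drift decay away.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


# Hypothetical test signals: a constant (0 Hz) and a fast alternating one.
dc = one_pole_highpass([1.0] * 1000, 16_000, 100.0)
fast = one_pole_highpass([(-1.0) ** n for n in range(1000)], 16_000, 100.0)
```

The constant input decays toward zero while the fast alternating input passes almost unchanged, which is all a high-pass filter does.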

September 8, 2016 - 08:22

Lincoln Ward

so.. how long do you guys think something like this will take? Like how long until we can at least communicate with our waifus through our computers?

September 8, 2016 - 08:24

Camden Sullivan

when the source gets released, probably 3 years but japan only at the beginning

September 8, 2016 - 08:25

Lucas Richardson

>source gets released

But, Google/DeepMind will never release the training data that actually generated the voice.

September 8, 2016 - 08:43

Liam Peterson

This sounds like a great way to make vocaloids even better. Makes me want to get a new vocaloid using wavenet.

September 8, 2016 - 08:50

Brayden Martin

the vocaloid makers hack deepmind and improve their hatsune miku

September 8, 2016 - 08:52

Luis Lee

I'm gonna scam so many old bastards with this.

September 8, 2016 - 08:54

Joshua Edwards

jesus, that is incredibly impressive. even the throwaway music bit at the bottom.

September 8, 2016 - 08:58

Logan Thomas

But what if I want a robotic voice?

September 8, 2016 - 09:00

Asher Sanchez

Jesus

September 8, 2016 - 09:05

Michael Johnson

You add a robotic voice filter over it?

September 8, 2016 - 09:07

Jackson Mitchell

It wouldn't be the same!

September 8, 2016 - 09:09

John Reyes

The future is scary.

September 8, 2016 - 09:26

Jayden James

the future is AMAZING

September 8, 2016 - 09:28

Cameron Morgan

>Hello, %user%, this is Cortana. I have detected illegal software on your system. Please, put hands behind your back and lie on the floor. The authorities are on their way. You have 20 seconds to comply.

September 8, 2016 - 09:32

Julian Thompson

>While you are waiting enjoy this piano music I composed for you. Have a nice day and remember, if you have nothing to hide you have nothing to fear.

September 8, 2016 - 09:35

Cameron Sullivan

So could something like this be used to replace voice actors in indie video games?

Could it use voice samples of VAs to replicate their voice?

September 8, 2016 - 09:41

Brody Thompson

yes, yes

September 8, 2016 - 10:02

Mason Jones

Vocaloid upgrades soon?

September 8, 2016 - 10:06

Camden Collins

This is actually a very interesting angle.
Give it a decade and voice actors and singers are going to be practically useless.
Only thing that matters is the guy creating the melody, lyrics and the text that these things read.

September 8, 2016 - 10:13

Chase Ross

They'll probably have AI that writes scripts for that by then too

September 8, 2016 - 10:15

Hunter Roberts

It just dawned on me

>AUTOMATIC ORGASM SOUND GENERATOR
>with the voice of every girl ever
>including famous actresses, singers, etc

September 8, 2016 - 10:17

Cooper Taylor

But stuff made with real people will be super authentic and REAL man.

September 8, 2016 - 10:21

Jordan Hall

Interesting how some parts of the future arrive much earlier than expected.

September 8, 2016 - 10:25

Brayden Hall

How long until the trolling begins?

youtube.com/watch?v=1B488z1MmaA

September 8, 2016 - 10:50

Jason Carter

I listened to all those audio samples and the voice doesn't sound much better.

I thought it was going to sound like a real human and not robotic.

It still sounds robotic and grainy.

September 8, 2016 - 10:50

Oliver Campbell

Most people in this thread are just jumping the gun.

We're still a few decades away from making it practical.

September 8, 2016 - 10:53

Carson Harris

I just want my own personal assistant that sounds like a real human female who can talk to me as i'm trying to fall asleep.

Since women today are such whores being brainwashed by SJW garbage from the corporate jew media I have no choice but to rely on technology as a substitute.

September 8, 2016 - 10:56

Anthony Campbell

It's still way ahead of previous results, and it can fool many humans. This shit was science fiction only yesterday.
>We're still a few decades away
Make it one year.

September 8, 2016 - 10:59

Liam Richardson

So you just want that OS from the movie Her?

September 8, 2016 - 11:04

Luis Lee

$1.00 has been deposited into your Google Wallet®.

September 8, 2016 - 11:06

Isaiah Howard

>guy literally gets cucked by an AI
it was silly

September 8, 2016 - 11:08

Tyler Hill

Not at all. While you are sleeping, working or doing whatever, the computer will be using all its power to learn.

No wonder she was cucking him. She could process a shitload of information before he could say Good Morning. She needed more ppl to keep her busy somehow.

September 8, 2016 - 11:11

Luke Lopez

it can make toned language, imitating samples. did you read it? can also be conditioned to use tone to imitate emotion

September 8, 2016 - 11:11

Jacob Baker

>google is microsoft
we might get very convincing, pleading ads, or the threat of using audio recordings to track tone used when speaking of certain subjects for the purpose of better targeted ads.

September 8, 2016 - 11:14

Brody Martin

>communicate with our waifus
In an intelligent way or just
>hello oniichan
>hello oniichan
>hello oniichan

September 8, 2016 - 11:14

Joseph Sanders

>release the training data
Who the fuck cares? Make your own training data. I'm going to bring Kuuko back from the dead when this gets released.

September 8, 2016 - 11:15

Bentley Peterson

time to make a porn audio dataset for

September 8, 2016 - 11:18

Alexander Bennett

Something something all that fuzz.
How the hell are they supposed to get rid of it?

September 8, 2016 - 11:19

Grayson Lopez

I think the most impressive part of this is 1) how raw audio generation isn't limited to human speech. 2) the model replicates breathing sounds and such as well, giving it an illusion of actually sounding like a real human bean.

Either way, how long until a working model is released to the public? I imagine Google wouldn't want Apple or Microsoft to gain access to this.

September 8, 2016 - 11:23

David Long

Make another neural net for static removal.

September 8, 2016 - 11:24

Dominic Thompson

the ones where it makes up its own language are trippy as fuck. it's like an alien race speaking to you.

September 8, 2016 - 11:27

Oliver Butler

literally how the brain works

The real problem is the massive computing power you need to make this work. Google can afford it, but there's no way you can run this on a normal PC, no matter the GPUs.

September 8, 2016 - 11:28

Zachary Rivera

As someone who does research in ML, that is honestly arousing.

September 8, 2016 - 11:34

Jack Rivera

just make a makeshift supercomputer with your thinkpad hoard.
will work well enough

September 8, 2016 - 12:21

Julian Jenkins

as someone who jacks off to vocaloids, that is honestly arousing

September 8, 2016 - 12:34

Dylan Lewis

Google is even worse than Microsoft desu.
Because google does exactly the same thing as microsoft, but you can't switch to another internet if you don't like it.

September 8, 2016 - 12:46

Joshua Richardson

musicfags on suicide watch

September 8, 2016 - 13:03

Nathan Roberts

I want to crosspost this to Sup Forums but I'm too tired.

September 8, 2016 - 13:10

Nicholas Mitchell

Voice actors and singers will just start suing people who imitate their voice. Laws will be passed that make it illegal.

September 8, 2016 - 13:11

Xavier Johnson

it can imitate billions of voices. it would be silly to make it illegal, it will never happen.

September 8, 2016 - 13:15

Justin Jenkins

>Voice actors and singers will just start suing people who imitate their voice. Laws will be passed that make it illegal.
Publicity rights in some states already cover voice it seems.

September 8, 2016 - 13:16

Ethan White

But that's retarded. What stops you from engineering a voice that sounds like that of a famous singer while still sounding slightly different? What stops you from generating a random voice that turns out to be the voice of a random girl in south africa? Will she sue you too? We may as well outlaw sound.

September 8, 2016 - 13:23

Jayden Gomez

>singers are going to be practically useless.

yes, because traditional guitars and pianos got totally replaced by e-guitars, synthies and computers.

September 8, 2016 - 13:24

Matthew Myers

actually they did

September 8, 2016 - 13:26

Nathan Long

ELI5?

September 8, 2016 - 13:27

Jonathan Miller

in radio pop maybe

September 8, 2016 - 13:37

Jason Jones

Yeah, it is retarded. Hopefully it does not spread.

Publicity rights were originally intended to be somewhat like trademark to keep people from falsely using a name, signature, image, voice, etc. to claim that a person was endorsing a product. Of course now they are basically yet another way for famous people to try and bother people they don't like with lawsuits or to try and get money from a company.

September 8, 2016 - 13:41

Nicholas Martin

If you go to the top of your screen, there's a little bar you can click on and type words into. Simply type in "reddit.com", but without the punctuation marks, and you will be transported to a place appropriate for you! :)

September 8, 2016 - 13:42

David Perez

Monsanto has copyrights on genetics of seeds. These seeds happen to blow off trucks and on to peoples' farms. Monsanto then sneaks onto their farm and tests for these genes, and if they find them, say goodbye to your farm/retirement/belongings.

What I am saying is: it will happen.

September 8, 2016 - 13:43

Kevin Morgan

>Monsanto then sneaks onto their farm
that sounds very illegal

September 8, 2016 - 13:56

Levi Gutierrez

Radio moderators are now obsolete.

September 8, 2016 - 14:03

Jeremiah King

They don't give a fuck, half of the government has shares in Monsanto.
They have literally written Monsanto seeds into the new Iraqi constitution.

these guys are above the law.

September 8, 2016 - 14:06

Jackson Evans

Sounds like parametric with less reverb. whoopdeedoo

Still sounds fake as shit.

September 8, 2016 - 14:10

Levi Torres

there is worse stuff, parts of the human genome are actually copyrighted (most of those are related to some disease/condition) and you can't sell medicine (and if I'm not mistaken can't even do research) that targets those genes without permission and paying the fees.
usa sure is the land of freedom...

September 8, 2016 - 14:14

Jose Barnes

The audio samples where the wavenet generates its own audio output are creepy as fuck

September 8, 2016 - 14:20

Oliver Robinson

wtf I'm liking this

September 8, 2016 - 14:25

Levi Mitchell

Really? It just reminds me of this

September 8, 2016 - 14:43

Nolan Thomas

>Because raw audio is typically stored as a sequence of 16-bit integer values (one per timestep), a
>softmax layer would need to output 65,536 probabilities per timestep to model all possible values.
>To make this more tractable, we first apply a μ-law companding transformation (ITU-T, 1988) to
>the data, and then quantize it to 256 possible values:

>f(x) = sgn(x) * ln(1 + μ|x|) / ln(1 + μ)

>where −1 < x < 1 and μ = 255. This non-linear quantization produces a significantly better
>reconstruction than a simple linear quantization scheme. Especially for speech, we found that the
>reconstructed signal after quantization sounded very similar to the original.

So does that mean that each sample in the generated sound can only take one of 256 values (spread over the 0 to 65535 range), essentially making it 8-bit instead of 16-bit?
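
A minimal sketch of the quoted transform in plain Python, to make the question concrete. The function names and the way samples are scaled to [-1, 1] and binned are my own assumptions; the quoted passage only gives the formula, μ = 255 and the 256 output values.

```python
import math

MU = 255        # value given in the quoted passage
LEVELS = 256    # number of classes after quantization


def mu_law_compress(x, mu=MU):
    """f(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def encode_sample(s16):
    """Signed 16-bit sample -> class index in 0..255 (assumed binning)."""
    y = mu_law_compress(s16 / 32768.0)
    return min(LEVELS - 1, int((y + 1.0) / 2.0 * LEVELS))


def decode_sample(code):
    """Class index -> approximate signed 16-bit sample (bin centre)."""
    y = (code + 0.5) / LEVELS * 2.0 - 1.0
    s = int(round(mu_law_expand(y) * 32768.0))
    return max(-32768, min(32767, s))


# All 65,536 possible inputs collapse onto this many distinct codes.
distinct_codes = len({encode_sample(s) for s in range(-32768, 32768)})
```

So under these assumptions every sample is one of 256 codes, i.e. 8 bits of information, but the codes are not evenly spaced over the 16-bit range.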

September 8, 2016 - 16:12

Lincoln Brown

Yes

September 8, 2016 - 18:06

Jose Rivera

This would be pretty good for ASMR stuff.

September 8, 2016 - 18:36

Kevin Moore

Yes, but that logarithm probably means that they have more resolution at low amplitudes (sample values near zero)
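
The companding acts on sample amplitude, so one way to see where the resolution ends up is to compare the width of a quantization bin near zero with one near full scale. This is a rough sketch that assumes the 256 bins are spaced evenly in the compressed domain; `bin_width` is a made-up helper name.

```python
import math


def mu_law_expand(y, mu=255):
    """Inverse of f(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu)."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def bin_width(code, levels=256, mu=255):
    """Width of bin `code`, measured in input amplitude on a [-1, 1] scale."""
    lo = mu_law_expand(2.0 * code / levels - 1.0, mu)
    hi = mu_law_expand(2.0 * (code + 1) / levels - 1.0, mu)
    return hi - lo


centre = bin_width(128)    # first bin above zero amplitude
edge = bin_width(255)      # last bin, next to full scale
uniform = 2.0 / 256        # step of a plain linear 8-bit quantizer
```

The bin near zero comes out much narrower than a linear 8-bit step and the bin near full scale much wider, so quiet parts of the signal keep finer detail.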

September 8, 2016 - 18:47

Jace Myers

Finally I can get a virtual Mr. Plinkett who reads Sup Forums posts to me all day

September 8, 2016 - 18:49

Henry Stewart

fuck yeah, nobody is going to need voice actors ever again.

September 8, 2016 - 18:56

Josiah Brown

that was my first thought
Audio files make up the majority of the game size in most cases, since they just don't compress well. Also if you want to change a single line of dialogue later on, you need to hire the same voice actor again, which is costly and time consuming.
If all voice can be stored in a kind of LaTeX or XML format, that will not only speed up development, but also allow dynamically generated dialogues that aren't just a bunch of text
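
Rough back-of-the-envelope arithmetic for the size argument, as a sketch only. It assumes uncompressed 16-bit mono audio at 16 kHz and a made-up 3-second duration for the line; real games ship compressed audio, so treat the ratio as an upper bound, and it ignores the size of the synthesis model itself.

```python
SAMPLE_RATE_HZ = 16_000   # assumption: 16 kHz mono
BYTES_PER_SAMPLE = 2      # 16-bit samples


def pcm_bytes(seconds):
    """Size of uncompressed PCM audio of the given length."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


# Example line borrowed from earlier in the thread; duration is a guess.
line = "You have 20 seconds to comply."
text_bytes = len(line.encode("utf-8"))
audio_bytes = pcm_bytes(3.0)
ratio = audio_bytes // text_bytes
```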

September 8, 2016 - 19:06

Nathaniel Diaz

sadly this doesn't sound all too far fetched

September 8, 2016 - 19:08

Cooper Edwards

there is no way you'd even remotely catch up with the amount of training data that google has though, even as you're posting on Sup Forums right now you're feeding it with shit tons of training data through the captcha

September 8, 2016 - 19:10

Evan Wilson

I like how all the piano tracks start out normal and go fucking ham before cutting out.

September 8, 2016 - 19:11

Jordan Cook

can't risk letting it gain sentience at this stage, our anuses are unprepared

September 8, 2016 - 19:14

Austin Howard

DAISY
DAISY
GIVE ME YOUR ANSWER DO~

September 8, 2016 - 19:19

Charles Edwards

That piano shit is neato

September 8, 2016 - 19:33

Aaron Hall

As the article said, generating the audio output takes forever, so forget about generating it on the fly. Could save the cost of the voiceactor.

People will still notice, though. It's not that it's on par with human voiceactors. It's just getting in the good enough to be tolerable range.

September 8, 2016 - 19:35

Christopher Thomas

It's good, really good, but I feel like if this is to be done properly it needs more forms of input. The emotion behind different words, the emphasis, pause length, etc.

If you could develop some sort of system where you can both input text and specify characteristics of speech within the text, we'd be getting close to complete accurate synthesis.

September 8, 2016 - 19:36

Parker Johnson

Oh god that generated music lmao

Sounds like beethoven having a stroke while playing

September 8, 2016 - 19:39

Bentley Wilson

Is there anything that deep learning CAN'T do?

September 8, 2016 - 19:40

John Hill

Read the article, the model varies output according to context.

Seems they couldn't get rid of the noise though, or maybe their training data is contaminated with noisy samples. Because it's harder to judge noisy samples, they might have had higher ratings by humans, so the model learned to include noise to get better grades for its output.

September 8, 2016 - 19:42

Nathan Edwards

Self written ASMR incoming, faggots
>relax, user, take a deep breath and count to 100 with me
>you are great, user
>run away with me user
>let me take that big pulsating cock with my tiny feet, user

September 8, 2016 - 19:42

Camden Thompson

This shit is fucking creepy

storage.googleapis.com/deepmind-media/pixie/knowing-what-to-say/first-list/speaker-4.wav

>the breathing and mouth noises

September 8, 2016 - 19:43

Bentley Jones

It's a universal approximator, so no.

September 8, 2016 - 19:43

Jonathan Young

Can it create a virtual gf for me?

September 8, 2016 - 19:48
