Hi everybody!
I'm an old-style DM, but I've recently learned the value of digitally assisted RP tools. (Still, I'm only a few weeks into using D&D Beyond.) I'm extremely interested in adding more content to my games. One technology I'm interested in is voice generation, though I'm guessing that many of you Dungeon Masters have already done a lot of research on this.
I'm looking for a program that allows me to "build" a character voice with tone, pitch, or even through sampling if necessary, and then run it during a campaign. So, for example, if I were to build a voice for an old, grumbling Goblin Captain named Gharak and had it queued up, the scenario below could happen:
Cleric(Player 1): "It appears we have walked into a filth-ridden cave. And who are you, goblin?"
Voice: "Yes, you are on goblin land. Best mind your manners when speaking to me, boy."
Druid(Player 2): "Are you a leader of the goblins?"
Voice: "I will ask the questions around here, Druid. Perhaps you should speak your business."
In the examples above, I'm responding with this voice model in real-time. My goal is to say anything I want in this voice, when needed.
Anybody seen anything like this? I'm thinking I should be able to build something in Python, if not. Anyhow, I'd really appreciate your experiences and advice.
Player since 1978. Dungeon Master since 1980.
Basic and Advanced Ed. - Still have my basic boxed set. Still have my Deities & Demigods (1st print Cthulhu/Melnibonean).
2nd Ed. 3.5, 5th Ed.
- Played in various tournaments throughout the US back in the 80's and 90's; it was great to crawl with you all.
~Avid Nerd
I work with data, data analysis, data science, automation programming by profession. If I can help out, feel free to ask.
I fear that I'm going to come across as rude here, but really I don't know how else to respond here.
'Tools' like the one you propose already exist. They've existed for a long time and are out there. The reason they're pretty obscure is because the cost and effort really isn't worth it. Consider the process.
1. DM creates a scenario
2. DM writes the pregenerated responses from an NPC
3. DM enters these into said program
4. Players handle the encounter in a non-standard way the DM hasn't prepared for.
In this scenario the tool isn't going to help. I've broken it before it's even been designed. So, what you actually need is something far more complex, likely comprising an LLM and an understanding of your planned scenarios and campaigns. That's a shed load of work, and not something that's easily customisable. Which means what you're designing at this point is a computer game, not a 'tool'.
Simple and harsh fact is that most reasonable DMs agree that character voices aren't a necessity. Yes, the overwhelming number of professionally produced shows out there can make it seem necessary, but it's not. Compound this with the fact that learning to drop your voice in pitch and tone, or raise it, or change pacing, or speak 'nasally' are all really quick methods of achieving the same goal... Short of this being a response tool for a DM who is non-verbal, it's kinda pointless.
Your players aren't robots and aren't going to respond as well to a computerised voice. I've seen such things in action and they don't tend to have a great reception in person. Players tend to want a human response, not an artificial response - they can get that from video games.
If you're set on trying this though, I'd suggest looking at Murf AI (AI Voice Generator: Versatile Text to Speech Software) or ElevenLabs (Generative AI Text to Speech & Voice Cloning). Be warned that it's pretty inaccessible compared to just speaking (even in your normal voice) in terms of cost. Worse still, it perpetuates the ruination of an entire industry (the voiceover industry).
DM session planning template - My version of maps for 'Lost Mine of Phandelver' - Send your party to The Circus - Other DM Resources - Maps, Tokens, Quests - 'Better' Player Character Injury Tables?
Actor, Writer, Director & Teacher by day - GM/DM in my off hours.
I think that the OP is looking for a configurable text to voice program where he types in what he wants the voice to say (or even recites it) and then the software re-creates the specific voice saying those words. I suspect it could likely be done using some form of generative AI which is likely what the links you provided do. The work would be in creating the different character voices and then telling the software which voice parameters to use when reciting a specific section of text.
At the moment, I suspect it would be a very niche application except perhaps for replacing voice actors which is an entirely different question.
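To make the "which voice parameters go with which text" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `VoiceProfile` fields, the numbers for Gharak, and the "name: text" typing convention are assumptions for illustration, not the settings of any real voice engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceProfile:
    """Hypothetical bundle of settings that defines one character voice."""
    name: str
    pitch_semitones: float  # negative values lower the voice
    rate: float             # 1.0 is normal speed, below 1.0 is slower


# Illustrative values only; a real engine defines its own parameter ranges.
PROFILES = {
    "gharak": VoiceProfile("Gharak", pitch_semitones=-6.0, rate=0.85),
    "narrator": VoiceProfile("Narrator", pitch_semitones=0.0, rate=1.0),
}


def route_line(typed: str, default: str = "narrator") -> tuple[VoiceProfile, str]:
    """Split 'character: text' into (profile, text); unknown prefixes use the default voice."""
    prefix, sep, rest = typed.partition(":")
    key = prefix.strip().lower()
    if sep and key in PROFILES:
        return PROFILES[key], rest.strip()
    return PROFILES[default], typed.strip()


profile, text = route_line("gharak: I will ask the questions around here, Druid.")
print(profile.name, "says:", text)
```

The profile and the text would then be handed to whatever speech engine is in use; that step is left out here because it depends entirely on the engine.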
Could something like this be done with autotune?
I've never used autotune and have only limited knowledge of its capabilities.
And again, I say that these tools exist. The links I mentioned above are already libraries of different 'voices'. There are APIs to basically take your text and have it read out using one of these voices. Heck, text-to-speech has been around since at least the 80s.
Suspecting it to be niche may be quite correct, but it was an admitted goal of WotC/Hasbro, and it is currently a goal for a LOT of marketing companies, social media platforms, and pretty much anyone not wanting to spend extra on voice artists (VO work is more than just acting in terms of content). This is a hot-button issue right now. It's partially why SAG-AFTRA are out on strike.
In the context of TTRPGs, my initial point still remains - why would you spend the time typing in the response you want to make and burn that much time (which over the course of a session will quickly stack up)? Do you really think players' attention and patience is going to hold long enough for someone to spend 30 seconds typing in a response, as opposed to just hearing the DM respond, even if it's in the same voice over and over again?
The concept is just outright poorly thought through.
Either you are canning responses like a soundboard, in which case players aren't predictable enough for that to work. Or, you are spending time thinking, then typing the response, and waiting for the software to speak it in the 'character voice'.
Sometimes the simplest solution is the best - the proposed solution here isn't simple, it's naive at best, stupid at worst.
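The 30-second figure can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below assumes a typing speed of 40 words per minute and 60 NPC replies per session; both numbers are assumptions picked for illustration, not figures from this thread. The reply text is Gharak's first line from the original post.

```python
def typing_seconds(reply: str, words_per_minute: float = 40.0) -> float:
    """Estimate seconds needed to type a reply at a given typing speed."""
    return len(reply.split()) / words_per_minute * 60.0


reply = "Yes, you are on goblin land. Best mind your manners when speaking to me, boy."
per_reply = typing_seconds(reply)

# Hypothetical session: 60 NPC replies of similar length.
replies_per_session = 60
print(f"{len(reply.split())} words, about {per_reply:.1f} s to type")
print(f"about {per_reply * replies_per_session / 60:.1f} min of typing per session")
```

A faster typist or shorter replies would shrink these numbers, so treat the result as an order-of-magnitude check rather than a measurement.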
Hi there! No, no, you're not coming across rude at all!
Actually, you make great sense. The one difference was that I didn't want to create pre-made responses; I was talking about a response generated and executed in real-time.
And technically, I'm not doubting your opinion (or advice): all are welcome here! I just wanted to test its viability. I think the only question I have is whether the technology 1) actually works (meaning it can be used to create a non-premade response in real-time) and 2) is seamless (meaning that the act of executing the response doesn't take over 4 seconds).
If applicable, I'd probably test this technology a couple of times to see how effective it is. Nothing "game-changing", and obviously, I have no desire, as you mentioned, "to ruin the voiceover industry". And lastly, I'd never pay a dime for this; I'd just develop it myself if it's not already free.
Martin, you've made some great points, and I look forward to hearing from you again.
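The "doesn't take over 4 seconds" test is easy to automate. Below is a minimal Python sketch of a timing harness; `fake_speak` is a hypothetical stand-in that only sleeps, and a real text-to-speech call would be swapped in for it. Note that this times only the speech step, not the time spent typing the line.

```python
import time
from typing import Callable

LATENCY_BUDGET_S = 4.0  # the "doesn't take over 4 seconds" threshold from the post


def timed_speak(speak: Callable[[str], None], text: str) -> tuple[float, bool]:
    """Run one speak call and return (elapsed seconds, whether it met the budget)."""
    start = time.perf_counter()
    speak(text)
    elapsed = time.perf_counter() - start
    return elapsed, elapsed <= LATENCY_BUDGET_S


def fake_speak(text: str) -> None:
    """Stand-in for a real text-to-speech call; it only sleeps briefly."""
    time.sleep(0.05)


elapsed, ok = timed_speak(fake_speak, "Best mind your manners when speaking to me, boy.")
print(f"{elapsed:.2f} s, within budget: {ok}")
```

Running this over a handful of typical NPC lines, rather than a single one, would give a fairer picture of whether a given tool is seamless enough.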
Hey again Martin! I can definitely hear the "Actor, Writer, Director & Teacher by day" tone in your responses and I highly respect that, sir. And I want you to know that I can understand how this tech bothers the acting industry.
But (for my part) this has nothing to do with replacing Voice Actors because I would never hire a Voice Actor for my games. (I guess I'm cheaper than I thought.) I hope we can laugh about all this, btw, because I'm not trying to insult them.
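My counter-response to some of the points:
David42: You're correct - no pre-generated responses. I'd type in a phrase and the program speaks what I typed.
Siyaj_Kak: I'll check out autotune because I'm not sure. Thanks for the suggestion!
Waiting 30 seconds for the program to execute: No way! If it took that long, there's no point. (But I bet the processing and technology gets better and better.)
Using Text-to-Speech: None of them have the level of voice modulation capabilities we're talking about.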
You know, I'm not even saying my proposal would work. I'm just interested in trying it out.
H.
I asked this question of one of my games today, and several people suggested a program called Voicemod. It works with your microphone: you speak into the microphone in your normal voice and other people hear the modified voice.
Hey,
It's great that you're exploring digital tools to enhance your Dungeon Master experience. AI text-to-speech technology is indeed a valuable resource for creating custom character voices. You can use AI text-to-speech solutions to generate voices with specific tones and pitches, and even sample real voices to build unique character voices.
While there might not be a pre-built program that does this exactly as you described, you can create a solution using Python and AI text-to-speech APIs. You'd need to feed the text lines for your NPCs into the API, which would generate the corresponding voice lines in real-time. Many AI text-to-speech services are available, such as Google Cloud Text-to-Speech, Amazon Polly, or even GPT-3-based models for more advanced capabilities.
Creating your solution in Python to integrate these services could allow you to have the level of control and customization you're looking for in your D&D campaigns. It's a creative approach to adding depth to your role-playing sessions. Good luck, and I hope your players enjoy the immersive experience!
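One service-neutral way to describe tone and pitch is SSML, the W3C speech markup that many text-to-speech services accept as input. Here is a minimal Python sketch that wraps an NPC line in a `<prosody>` element. The default pitch and rate values are assumptions chosen to suggest an old, grumbling goblin, and which prosody attributes are honored varies by service and voice, so check the documentation of whichever service you pick.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text: str, pitch: str = "-20%", rate: str = "85%") -> str:
    """Wrap text in an SSML prosody element; escape() keeps '&' and '<' from breaking the XML."""
    return (
        "<speak>"
        f"<prosody pitch={quoteattr(pitch)} rate={quoteattr(rate)}>"
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )


print(to_ssml("I will ask the questions around here, Druid."))
```

The resulting string would be sent as the SSML input of a synthesis request; the request itself is omitted because each service has its own client library and credentials.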
Hey all, I'm new to this world and would love to play, but I have a stammer, so I was thinking: what about some sort of text-to-speech program? Are there any D&D-specific ones, or does anyone know any that work?
Hi there!
It sounds like you're diving into an exciting area of digitally assisted DMing! For what you're describing—a tool that allows you to build custom character voices with adjustable tone and pitch, and even use them in real-time—I recommend checking out https://audiomodify.com/.
AudioModify is a fantastic platform for AI-driven voice generation, and it might fit your needs perfectly. You can experiment with creating unique voices for your characters, like your grumbling Goblin Captain, Gharak. Their tools are designed to let you adjust tone, pitch, and other characteristics to craft the ideal voice model. Plus, it can be a great way to add a dynamic and immersive layer to your campaigns.
Give it a shot, and let us know how it works for you!