/kc/ - Krautchan

Highest Serious Discussion Per Post on Endchan




 >>/31324/
also, an interesting outcome of using phonetics is words that sound the same but are spelt differently
https://people.sc.fsu.edu/~jburkardt/fun/wordplay/multinyms.html
like: to too two, their there they're
a simple solution I've been doing is just keeping one of each homonym words in the dictionary so it always translates back to that word in english. open to suggestions though.
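A minimal sketch of the "keep one word per homonym group" idea, assuming the groups are listed by hand. The names `HOMOPHONE_GROUPS`, `CANONICAL` and `canonicalize` are made up for illustration and are not taken from the actual script; the two groups are the ones mentioned above.

```python
# Hypothetical homophone groups; the first word of each group is the one
# that stays in the dictionary, so every member translates back to it.
HOMOPHONE_GROUPS = [
    ["to", "too", "two"],
    ["there", "their", "they're"],
]

# Map every member of a group to the group's first word.
CANONICAL = {word: group[0] for group in HOMOPHONE_GROUPS for word in group}


def canonicalize(text):
    """Replace each known homophone with its group's canonical spelling."""
    return " ".join(CANONICAL.get(word, word) for word in text.lower().split())


print(canonicalize("their two cats went there too"))
# prints: there to cats went there to
```

The round trip loses the original spelling, which is the trade-off described above: the reader has to recover "too" or "two" from context.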
thumbnail of 5cbfbe9c4468b.jpeg
5cbfbe9c446... jpeg
(198.21 KB, 600x892)
i once toyed with the idea of making a script that changes all the words (in english) to close synonyms (or antonyms, w/e). ideally there would be a variable that tunes the "closeness" of the synonyms (so you can also use more distant ones for memetic effect)

it's actually pretty easy to write. you just need a fairly big dictionary with words as keys and lists of synonyms as values, then basically:

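import random
# synDict maps a word to a list of synonyms; unknown words pass through unchanged
output = " ".join(random.choice(synDict.get(word, [word])) for word in text.split())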

that's nearly all the code
 >>/31328/
So synDict is an associative array and "word" is a placeholder for an expression you want the synonym of?
That would be a long ass array.
And how do you decide which is the basic word (which gets to be the key) and which are just synonyms? Or every word is a key once and value several times simultaneously?
 >>/31332/
nohow i suppose

 >>/31331/
yes, it's a big [ass] dict (well, not actually: there are hardly more than 5k useful words in english). also, i'm sure there are already dicts of synonyms (for stardict or something), or one can be parsed/ripped from dictionary sites online.

yes, the same word can appear both as a key and inside the value lists of other keys, but i suppose that should not be a problem. the keys themselves are unique (i think)

it will be like this
synDict:
k: (american) -> v: [fat, stupid, yankee]
k: (german) -> v: [kraut, nazi]
k: (nazi) -> v: [german, kraut]

etc. in theory you just need to make sure that the keys are unique and then pick random values. also, some online dicts of synonyms already have some form of closeness ordering, so for this "closeness" variable you just need to pick either the first ones (from the beginning of the list of values) or the more distant ones (from the end of the list)

its not that complex, most of the work is to build a dictionary from something
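A rough sketch of how the "closeness" knob could work, assuming each synonym list is ordered from closest to most distant (the post only says some online dictionaries are ordered that way). The sample data and the names `SYN_DICT`, `pick_synonym` and `replace_words` are hypothetical, as is the choice of a sliding window over the list.

```python
import random

# Hypothetical sample data: each list is assumed to run from closest
# synonym to most distant one.
SYN_DICT = {
    "big": ["large", "huge", "colossal"],
    "happy": ["glad", "cheerful", "ecstatic"],
}


def pick_synonym(word, syn_dict, closeness=1.0, rng=random):
    """Pick a synonym; closeness=1.0 favours the start of the list,
    closeness=0.0 favours the end. Unknown words are returned as-is."""
    candidates = syn_dict.get(word)
    if not candidates:
        return word
    n = len(candidates)
    width = max(1, n // 2)  # size of the window we sample from
    start = round((1.0 - closeness) * (n - width))  # slide window by closeness
    return rng.choice(candidates[start:start + width])


def replace_words(text, syn_dict, closeness=1.0):
    """Replace every word independently (the "dumb" mode)."""
    return " ".join(pick_synonym(w, syn_dict, closeness) for w in text.split())


print(replace_words("a big happy dog", SYN_DICT, closeness=1.0))
print(replace_words("a big happy dog", SYN_DICT, closeness=0.0))
```

With three-entry lists the window is a single word, so the two calls give the closest and the most distant synonyms respectively; longer lists would leave room for the random choice to matter.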
> And how do you decide which is the basic word (which gets to be the key) and which are just synonyms? 

i think in the "dumb" mode it will just replace the word multiple times without checking something like this, so:

"german german german" will become "nazi kraut nazi"

but in more sophisticated modes it's possible to map each word to its replacement (say, when it is replaced for the first time) and then reuse this map:

"german german german" will be: "nazi nazi nazi"
"german german nazi nazi" will be: "nazi nazi kraut kraut"

etc
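The "map once, then reuse" mode described above might look roughly like this. It is only a sketch: the function name `replace_sticky` and the neutral sample dictionary are invented for illustration.

```python
import random

# Hypothetical sample data; words may appear both as keys and as values.
SYN_DICT = {
    "big": ["large", "huge"],
    "large": ["big", "huge"],
}


def replace_sticky(text, syn_dict, rng=random):
    """Choose a random synonym the first time a word is seen and reuse
    that same choice for every later occurrence of the word."""
    chosen = {}  # word -> synonym picked on first sight
    out = []
    for word in text.split():
        if word not in chosen:
            chosen[word] = rng.choice(syn_dict.get(word, [word]))
        out.append(chosen[word])
    return " ".join(out)


print(replace_sticky("big big large large", SYN_DICT))
```

Repeated words always come out identical within one call, while different calls can still pick different synonyms.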
thumbnail of cupfrogs_3.png
cupfrogs_3 png
(68.37 KB, 250x246)
Pretty good thread going on. I was thinking about translating works of literature after the Berndese language is more developed.

Maybe start with short novels and then move on to more extensive works.

Thoughts on this? The possibilities of this thing seem endless IMHO
I see a caveat in Berndese.
As is, it's good as a code: one can decipher it if he has the substitution table and knows the phonetic rule. However, writing in Berndese could cause reading and writing difficulties in languages that use the Latin script, since one would condition himself to use the Latin characters for different sounds. It's like when someone uses a foreign language (almost exclusively, though), which can impair his ability to speak his native one.

 >>/31336/
Yeah, that sounds ebin.

 >>/31341/
For starters some short stories, or jokes.
 >>/31342/
Yes, this is a fair consideration. I thought this over after the first comments were made about certain vowel sounds like i:. Ours is a unique situation where posters have different accents, which affect pronunciation and thus a phonetic language. As such, Berndese cannot be read and written like English: to decipher a word it must be read aloud (or in your head), which is slower than reading/writing the normal way. It is similar to when some bernd writes "eggsburt", so in theory it shouldn't be any more difficult to decipher. The only way to know for certain is to try. An interesting side effect is seeing how bernd pronounces certain words: I can read what you write in your accent.
thumbnail of external-content.duckduckgo.com.jpg
external-content.duck... jpg
(154.92 KB, 1338x669)
another simple recipe for a bernd language is to use a different keyboard layout when typing english, for example typing in Colemak while the Qwerty layout is set in the OS/editor (for berndish transliteration). Learning colemak also benefits typing (and is fun), so in a way it's useful (not sure if reading/writing the qwerty -> colemak translit is useful too)

basically it's normal english with a different alphabet that can be typed quite fast on any computer (even without a colemak layout) after a few weeks of learning. reading it is probably a lot more complicated, however
you also can buy a colemak keyboard or glue some letters on your old one if you need it

for example:
qwerty: "they can't quite get this message"
colemak: "fhko caj'f qilfk tkf fhld mkddatk"

and there are already tools and online converter/decoder:
https://colemak.com/Converter
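The same transliteration can be sketched in a few lines of Python with `str.maketrans`. This is an illustration, not the code behind the converter linked above: the two strings list the three letter rows of each layout in the same physical key order, and the names `to_berndish` and `from_berndish` are made up. Only lowercase letters and `;` are handled; everything else passes through unchanged.

```python
# Letter rows of each layout, listed key by key in the same physical order.
COLEMAK = "qwfpgjluy;" "arstdhneio" "zxcvbkm"
QWERTY = "qwertyuiop" "asdfghjkl;" "zxcvbnm"

# Typing by Colemak finger positions while the OS is set to Qwerty:
# the key that carries a Colemak letter emits the Qwerty letter instead.
ENCODE = str.maketrans(COLEMAK, QWERTY)
DECODE = str.maketrans(QWERTY, COLEMAK)


def to_berndish(text):
    """Normal english -> layout-shifted text."""
    return text.translate(ENCODE)


def from_berndish(text):
    """Layout-shifted text -> normal english."""
    return text.translate(DECODE)


print(to_berndish("they can't quite get this message"))
# prints: fhko caj'f qilfk tkf fhld mkddatk
```

Because both strings contain the same 27 characters, the mapping is a plain substitution cipher and decoding is just the reverse table.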
 >>/31380/
yes, using a cipher on english is a simple encryption method but it's not a 'language' and it cannot be read without deciphering due to the nature of english. Another interesting solution is pictographic encrypted messages, something that was popular on 8ch/cyber and lain/cyb, but again tools must be used to decipher the code, it cannot be read. I'm sorry I cannot remember the tool used for the pictures but it was a firefox extension, which was cool.

 >>/31350/
hah, yes, I had thought it would make it easier to remember but it could be equally confusing.
I don't know about you but I was able to read that first line without translating, certainly finding it easier to read than to write atm.
eez ee, aa, oo qdw tqk^fr ks lj^rn z^rz^

> ks z^kk^ k^rxaa eez python wxboojdzsz^?
z^kzr, j^dfz^ nk^ gzfh k^kxee sx^nxaa
z^kk^ ks gzfhxaa lz^ k^kxee eez wxboojdzsz^ vx^ sx^nxaa. x^ fx^b "nrx^pqs"

> vnt, dj^ r^
? r^?

gqk^k z^rz^ ks z^kk^ fksdxaa?
thumbnail of berndesev2.zip
berndesev2 zip
(5.34 KB, 0x0)
 >>/31405/
yes, this presented me with some thinking, well noticed. I'm against the use of '^' for anything other than the vowels but you are correct, special use cases need a z, not an s. For example 'spaz' which is not the same as spas or spats. I've chosen the 'i' character to represent this.

Also, I've made some updates that should make things easier. The scripts should be more user friendly now and I've expanded the dictionary. Please feel free to experiment. The bernd script will now allow you to spell words phonetically in english (with ^ indicators next to long vowels) and cipher them for you. It's still not perfect and there might be bugs, so please give me feedback.
thumbnail of berndese-js.zip
berndese-js zip
(26.11 KB, 0x0)
Wrote a very simple html/js implementation just for fun. It preserves markup and tries to be somewhat user friendly. May contain bugs too.

I've used the python script as the reference implementation, but did some things differently.

Just open index.html in a browser. Technically, berndese.js may be used separately, it isn't tied to any markup. It may require a relatively modern browser (FF/chrome is ok, latest palemoon works).
 >>/33311/
hah, I was going to say it was broken but 'spell text phonetically in english' was in the wrong box.

I like it very much. I didn't like having to import the separate regex library in the python script, plus your solution is much more elegant as it keeps the string rather than pulling bits out so you keep punctuation. However, I have found a bug with eng>bernd 'c' 'k' and 'f' 'ph'. I don't think this would be an issue if you just advise not to spell with 'k' or 'ph'.

I like the interface too but it needs some sort of mascot on the page. Berndese mascot?

tzsz^ mk^w k^ksb
 >>/33317/
> I didn't like having to import the separate regex library in the python script

There is nothing wrong with importing the "re" module, because it is part of the standard library (so effectively part of the language as shipped). So, it is pure "vanilla" python without external dependencies.

> I have found a bug with eng>bernd 'c' 'k' and 'f' 'ph'.

I've used the original translation table from python script v2 but put it into one big object, and swapped keys and values to get the table for the other direction. But now I see that this is wrong, because entries with the same key overwrite each other and some data is lost. I'll rewrite that part later maybe, and it will fix some problems.
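The data loss from swapping keys and values is easy to reproduce. The sketch below is in Python rather than JS, and the table is hypothetical: `X1` and `X2` are placeholders standing in for whatever Berndese symbols 'c'/'k' and 'f'/'ph' actually share, which this post does not spell out.

```python
# Hypothetical eng -> bernd table where two spellings share one symbol.
ENG_TO_BERND = {"c": "X1", "k": "X1", "f": "X2", "ph": "X2"}


def invert_naive(table):
    """Swap keys and values; later entries silently overwrite earlier ones."""
    return {value: key for key, value in table.items()}


def invert_grouped(table):
    """Swap keys and values but keep every original key in a list."""
    inverted = {}
    for key, value in table.items():
        inverted.setdefault(value, []).append(key)
    return inverted


print(invert_naive(ENG_TO_BERND))    # {'X1': 'k', 'X2': 'ph'}
print(invert_grouped(ENG_TO_BERND))  # {'X1': ['c', 'k'], 'X2': ['f', 'ph']}
```

Four entries collapse to two in the naive version. Keeping separate tables per direction, or grouping as above and choosing one preferred spelling per symbol, avoids the collision.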
 >>/33322/
I spent some time trying to make the python work without regex but I'm just not skilled enough. I might take another swing, you have inspired me.

I looked at the code before testing and saw that might be an issue but honestly I don't think it's a problem. I do think it needs a little help button to explain the vowel sounds anyway, just mention there not to spell with k or ph.
thumbnail of NFA-and-DFA-for-RegEx-abcd.png
NFA-and-DFA-for-RegEx... png
(15.24 KB, 723x223)
thumbnail of berndese-js-v2.zip
berndese-j... zip
(26.15 KB, 0x0)
 >>/33337/
> I spent some time trying to make the python work without regex

I don't recommend this. Replacing multiple overlapping patterns is a tedious task. For example, take the Berndese string "bk^k" and translate it into English: first we replace "bk^k" with "q", but next we'll match "q" (it also exists in Berndese) and replace it with "a". So, the translation is ruined. No sorting or other method would help.

To overcome this, you need to keep track of replaced substrings (start/end indexes), but that isn't fun either, because replacements have a different length than the matching pattern. So, in the end you'll need to create a finite automaton like in regex (DFA/NFA). It isn't impossible, but it's worth doing only for educational purposes and not an easy task at all.
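The "bk^k" problem described above can be shown side by side with a single-pass regex replacement. This is a sketch, not the code from either zip: the table holds only the two entries named in this post, and the function names are invented.

```python
import re

# Only the two Berndese -> English entries mentioned in the post.
TABLE = {"bk^k": "q", "q": "a"}


def naive_replace(text, table):
    """Apply the replacements one after another; earlier output gets
    matched again by later patterns."""
    for pattern, replacement in table.items():
        text = text.replace(pattern, replacement)
    return text


def single_pass_replace(text, table):
    """Scan the input once with one alternation regex, longest pattern
    first, so already replaced text is never looked at again."""
    keys = sorted(table, key=len, reverse=True)
    regex = re.compile("|".join(re.escape(key) for key in keys))
    return regex.sub(lambda match: table[match.group(0)], text)


print(naive_replace("bk^k", TABLE))        # a  (wrong: "q" was replaced again)
print(single_pass_replace("bk^k", TABLE))  # q
```

`re.sub` only matches against the original input, which is the start/end bookkeeping mentioned above done by the regex engine; `re.escape` is needed because `^` is a regex metacharacter.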

Also here is updated version with better table for replacements and some small fixes.
 >>/33370/
chuckled, very good

 >>/33366/
yeah, i ended up with a spaghetti code head fuck and even my own comments became alien to me, deleted it in frustration and just went with regex solution. Might take another swing anyway, just because it got under my skin.

Works well, good stuff. still needs a mascot though.
 >>/35846/
So input IPA instead of english for translation? That's actually not a bad idea, though not sure how I would personally use it. I should leave that to someone who has experience in using IPA.

I got to 816 words translated between me and hungry-ball. I figured I would wait to get to 1000 before doing a release with py2exe. I think though, we can do some in browser translation thanks to rus-ball and a greasemonkey script. Just how complex we can have it I'm not sure because I've not tried but I figured if I can free up some time soon I might get something done.
 >>/35849/
> So input IPA instead of english for translation?
Oh no, I mean for the letters used to represent sounds when the language is typed/written.
For example, if a word includes a "TH" sound, such as the one used in "that", it would be written with an IPA "ð"
