Unicode-based 'tweet compressor' in Perl


I want to implement my own tweet compressor. The script below is basically what it does, but I am stuck on some Unicode problems.

This is my script:

    #!/usr/bin/env perl
    use warnings;
    use strict;

    print tweet_compress('cc ms ns ps in ls fi fl ffl ffi iv ix vi oy ii xi nj/, "\. " ,", "'), "\n";

    sub tweet_compress {
        my $tweet = shift;
        $tweet =~ s/\s+$//;    # strip trailing whitespace
        my @orig = (qw/cc ms ns ps in ls fi fl ffl ffi iv ix vi oy ii xi nj/, ". ", ", ");
        my @new  = qw/㏄ ㎳ ㎱ ㎰ ㏌ ʪ ﬁ ﬂ ﬄ ﬃ ⅳ ⅸ ⅵ ѹ ⅱ ⅺ ǌ ． ，/;
        $tweet =~ s/$orig[$_]/$new[$_]/g for 0 .. $#orig;
        return $tweet;
    }

but it prints junk at the terminal:

    ?.?.

What am I doing wrong?

Two problems.

First, you have Unicode characters in your source code. Make sure you save the file as UTF-8 and use the utf8 pragma.

Also, if you intend to run this from the console, make sure the console can handle Unicode. The Windows Command Prompt will always show ? whether or not your data is correct. I ran it on Mac OS with the terminal set to handle UTF-8.
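To see what the pragma changes, here is a minimal sketch, assuming the file is saved as UTF-8. The variable names are only illustrative and are not part of the script above. With use utf8 the literal is a single character, and it becomes three bytes once encoded for output:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                              # decode literals in this file as UTF-8
use Encode qw(encode);

binmode STDOUT, ':encoding(UTF-8)';    # encode characters when printing

my $char  = '㏄';                       # U+33C4 SQUARE CC
my $bytes = encode('UTF-8', $char);    # the byte sequence sent to the terminal

printf "characters: %d, bytes: %d\n", length($char), length($bytes);
print "$char\n";
```

Without use utf8, length($char) would report 3 instead of 1, because Perl would treat the literal as three raw bytes rather than one character.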

Second, the "." in your @orig list is interpreted by the regex engine as "any single character", so you get the wrong result. You need to escape it before using it in your regular expression. I have modified the program a bit to make it work.

    #!/usr/bin/env perl
    use warnings;
    use strict;
    use utf8;    # use character semantics

    # make sure the data is re-encoded to UTF-8 when output to the terminal
    binmode STDOUT, ':utf8';

    print tweet_compress('cc ms ns ps in ls fi fl ffl ffi iv ix vi oy ii xi nj/, "\. " ,", "'), "\n";

    sub tweet_compress {
        my $tweet = shift;
        $tweet =~ s/\s+$//;    # strip trailing whitespace
        my @orig = (qw/cc ms ns ps in ls fi fl ffl ffi iv ix vi oy ii xi nj/, '\. ', ', ');
        my @new  = qw/㏄ ㎳ ㎱ ㎰ ㏌ ʪ ﬁ ﬂ ﬄ ﬃ ⅳ ⅸ ⅵ ѹ ⅱ ⅺ ǌ ． ，/;
        $tweet =~ s/$orig[$_]/$new[$_]/g for 0 .. $#orig;
        return $tweet;
    }
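To illustrate the escaping problem in isolation, here is a small sketch. The sample string and variable names are made up for the example. Wrapping the interpolated string in \Q...\E (or passing it through quotemeta) is one way to make it match literally, as an alternative to writing the backslash by hand:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $text    = 'a. b, c';
my $literal = '. ';    # meant as "full stop followed by a space"

# Unescaped: "." is a wildcard, so ", " is matched as well as ". "
(my $wrong = $text) =~ s/$literal/!/g;

# \Q...\E quotes regex metacharacters, so only a real ". " matches
(my $right = $text) =~ s/\Q$literal\E/!/g;

print "$wrong\n";    # a!b!c
print "$right\n";    # a!b, c
```

The same quoting could be applied to every entry of a replacement table, so that none of the keys has to be escaped manually.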
