Transcoding in Python
main.py
import sys
import base64
text = 'plain text'
key = "key"
l = len( key )
y = bytes()
for i, c in enumerate( text ):
    k = key[ i % l ]
    d = chr( ord( c )^ord( k )).encode()
    y += d
print( text := base64.b64encode( y ))
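The XOR step above is its own inverse, so the same routine can serve for encoding and decoding. The following is only a sketch under that assumption; the names xor_bytes, xor_encode and xor_decode are hypothetical, and unlike the listing above it XORs UTF-8 bytes rather than code points (for plain ASCII text and keys both give the same result). This is obfuscation, not encryption.

```python
import base64
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def xor_encode(text: str, key: str) -> bytes:
    """UTF-8-encode text, XOR it with the key, wrap the result in base64."""
    return base64.b64encode(xor_bytes(text.encode("utf-8"), key.encode("utf-8")))


def xor_decode(token: bytes, key: str) -> str:
    """Reverse xor_encode: base64-decode, XOR with the same key, UTF-8-decode."""
    return xor_bytes(base64.b64decode(token), key.encode("utf-8")).decode("utf-8")


token = xor_encode("plain text", "key")
print(token)
print(xor_decode(token, "key"))
```

Working on bytes avoids the case where chr( a ^ b ) yields a code point above 127 and is then expanded to two bytes by encode().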
main.py
import sys
import base64
# b'cryptotext' is a placeholder for the base64 output of the encoder above
text = base64.b64decode( b'cryptotext' ).decode( "utf_8" )
key = "key"
l = len( key )
y = bytes()
for i, c in enumerate( text ):
    k = key[ i % l ]
    d = chr( ord( c )^ord( k )).encode()
    y += d
__import__( "sys" ).stdout.flush()
__import__( "sys" ).stdout.buffer.write(y)
__import__( "sys" ).stdout.flush()
- transcript
print( ( ''.join( [ f'{x:02x} ' for x in b"source" ])))
print( ( ''.join( [ f'{x:02x} ' for x in "source".encode( 'ascii' ) ])))
print( ( ''.join( [ f'{x:02x} ' for x in __import__( "urllib.parse" ).parse.quote( "source", encoding='iso8859-1' ).encode( 'iso8859-1' ) ])))
bytes.fromhex( '41 42 43' )
bytes.fromhex( '414243' )
main.py
print( __import__( "quopri" ).decodestring( '=9a' ))
- transcript
b'\x9a'
main.py
print( *"Hi, I'm plain ASCII!".encode( 'iso8859-1' ))
- transcript
72 105 44 32 73 39 109 32 112 108 97 105 110 32 65 83 67 73 73 33
main.py
print( ''.join( [ format( b, '08b' )for b in "Hi, I'm plain ASCII!".encode( 'US-ASCII' )]))
- transcript
0100100001101001001011000010000001001001001001110110110100100000011100000110110001100001011010010110111000100000010000010101001101000011010010010100100100100001
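The bit string above can be turned back into text by reading it eight bits at a time. A minimal sketch, assuming a single-byte-per-group layout as produced above; the helper names to_bits and from_bits are hypothetical.

```python
def to_bits(text: str, encoding: str = "ascii") -> str:
    """Encode text and render every byte as eight binary digits."""
    return "".join(format(b, "08b") for b in text.encode(encoding))


def from_bits(bits: str, encoding: str = "ascii") -> str:
    """Read the bit string in groups of eight and decode the resulting bytes."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


bits = to_bits("Hi, I'm plain ASCII!")
print(bits)
print(from_bits(bits))
```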
main.py
from urllib.parse import unquote
print( unquote( 'Z6%2BH%2B%2Ff5OpQ%3D', encoding='iso8859-1' ))
- transcript
Z6+H+/f5OpQ=
Quoted Printable
Headers (decoding UTF-8/base64)
b64decode expects a bytes-like object (an ASCII-only str is also accepted) and returns a bytes object
main.py
print()
__import__( "sys" ).stdout.buffer.write\
( __import__( 'base64' ).b64decode( '=?UTF-8?B?UmU6IGl0J3MgYWJvdXQgIOKApmluZw==?='[ 10:-2 ]))
print()
- transcript
Re: it's about …ing
main.py
print()
__import__( "sys" ).stdout.buffer.write\
( __import__( 'base64' ).b64decode( '=?UTF-8?B?SGFhcnfDpHNjaGU=?='[ 10:-2 ]))
print()
- transcript
Haarwäsche
main.py
print()
__import__( "sys" ).stdout.buffer.write\
( __import__( 'base64' ).b64decode( '=?UTF-8?B?bWVhbmluZyBvZiDigJxkaWdlc3TigJ0=?='[ 10:-2 ]))
print()
- transcript
meaning of “digest”
main.py
print()
__import__( "sys" ).stdout.buffer.write\
( __import__( 'base64' ).b64decode( 'RnJvbSB0aGlzIHllYXIsIHRoZSBudW1iZXIgb2Yg5pWZ6IKy5ryi5a2XIGhhcyBiZWVuIGluY3JlYXNlZCBmcm9tDQoxLDAwNiB0byAxLDAyNi4gSSBoYXZlIGp1c3QgdXBkYXRlZCB0aGUgZGF0YWJhc2UgZHJpdmluZyB0aGUgS2FuamlkaWMNCnZlcnNpb25zIHRvIHJlZmxlY3QgdGhlIGNoYW5nZXMuDQoNCjIwIGFkZGl0aW9uYWwga2FuamkgaGF2ZSBiZWVuIGFkZGVkIHRvIHRoZSA0dGggZ3JhZGUgc2V0Og0K6Iyo44CB5aqb44CB5bKh44CB5r2f44CB5bKQ44CB54aK44CB6aaZ44CB5L2Q44CB5Z+844CB5bSO44CB5ruL44CB6bm/44CB57iE44CB5LqV44CB5rKW44CB5qCD44CB5aWI44CB5qKo44CB6Ziq44CB6ZicDQooYXMgY2FuIGJlIHNlZW4sIG1hbnkgb2NjdXIgaW4gdGhlIG5hbWVzIG9mIHByZWZlY3R1cmVzLikNCg0KMzIga2FuamkgaGF2ZSBiZWVuIG1vdmVkIGJldHdlZW4gZ3JhZGVzOg0KLSDog4MgYW5kIOiFuCBoYXZlIGJlZW4gbW92ZWQgZnJvbSBncmFkZSA0IHRvIGdyYWRlIDYuDQotIOWBnCDlj7Ig5ZGKIOWWnCDlm7Ig5Z6LIOWggiDlo6sg5b6XIOaVkSDmrbQg5q66IOavkiDnsokg57SAIOiEiCDoiKog6LGhIOiyryDosrsg6LOeIGhhdmUNCmJlZW4gbW92ZWQgZnJvbSBncmFkZSA0IHRvIGdyYWRlIDUNCi0g5L+1IOWIuCDmgakg5om/IOaVtSDoiIwg6YCAIOmKrSDpoJAgaGF2ZSBiZWVuIG1vdmVkIGZyb20gZ3JhZGUgNSB0byBncmFkZSA2Lg0KDQpKaW0NCg0KW1NlZTogaHR0cHM6Ly9qYS53aWtpcGVkaWEub3JnL3dpa2kvJUU2JTk1JTk5JUU4JTgyJUIyJUU2JUJDJUEyJUU1JUFEJTk3XQ=='))
print()
- transcript
From this year, the number of 教育漢字 has been increased from
1,006 to 1,026. I have just updated the database driving the Kanjidic
versions to reflect the changes.
20 additional kanji have been added to the 4th grade set:
茨、媛、岡、潟、岐、熊、香、佐、埼、崎、滋、鹿、縄、井、沖、栃、奈、梨、阪、阜
(as can be seen, many occur in the names of prefectures.)
32 kanji have been moved between grades:
- 胃 and 腸 have been moved from grade 4 to grade 6.
- 停 史 告 喜 囲 型 堂 士 得 救 歴 殺 毒 粉 紀 脈 航 象 貯 費 賞 have
been moved from grade 4 to grade 5
- 俵 券 恩 承 敵 舌 退 銭 預 have been moved from grade 5 to grade 6.
Jim
[See: https://ja.wikipedia.org/wiki/%E6%95%99%E8%82%B2%E6%BC%A2%E5%AD%97]
main.py
print()
__import__( "sys" ).stdout.buffer.write\
( __import__( 'base64' ).b64decode( '=?UTF-8?B?UmU6IFvKjl0gZXQgW8myXSBlbiBmcmFuw6dhaXM=?='[ 10:-2 ]))
print()
- transcript
Re: [ʎ] et [ɲ] en français
....
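The listings above slice the "=?UTF-8?B?" prefix off by hand. As an alternative sketch, the standard library's email.header.decode_header can parse such encoded words itself, including the charset and the B or Q transfer encoding. The wrapper name decode_mime_header is hypothetical, and falling back to ASCII for parts without a charset is an assumption of this sketch.

```python
from email.header import decode_header


def decode_mime_header(raw: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded words."""
    parts = []
    for payload, charset in decode_header(raw):
        if isinstance(payload, bytes):
            # Encoded words arrive as bytes together with their charset;
            # unencoded fragments may arrive as bytes with charset None.
            parts.append(payload.decode(charset or "ascii", errors="replace"))
        else:
            # A header without any encoded word is returned as str.
            parts.append(payload)
    return "".join(parts)


print(decode_mime_header("=?UTF-8?B?SGFhcnfDpHNjaGU=?="))
print(decode_mime_header("=?UTF-8?B?bWVhbmluZyBvZiDigJxkaWdlc3TigJ0=?="))
```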
main.py
import quopri
# Content-Transfer-Encoding: quoted-printable
# Content-Type: text/plain; charset="UTF-8"
print( quopri.decodestring('=C2=A3').decode('utf-8') )
- transcript
£
main.py
import quopri
# Content-Transfer-Encoding: quoted-printable
# Content-Type: text/plain; charset="UTF-8"
print( quopri.decodestring('=?UTF-8?Q?...='[ 10: ]).decode('utf-8') )
- transcript
Encoding to qp(UTF-8)
main.py
import quopri
print( quopri.encodestring(chr(168).encode('utf-8')) )
- transcript
b'=C2=A8'
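To check an encoding like the one above, it can help to round-trip it through quopri.decodestring. A minimal sketch; the helper names qp_encode and qp_decode are hypothetical, and UTF-8 as the default charset is an assumption matching the Content-Type comments used on this page.

```python
import quopri


def qp_encode(text: str, charset: str = "utf-8") -> bytes:
    """Encode text in the given charset, then as quoted-printable."""
    return quopri.encodestring(text.encode(charset))


def qp_decode(data: bytes, charset: str = "utf-8") -> str:
    """Undo quoted-printable, then decode the bytes in the given charset."""
    return quopri.decodestring(data).decode(charset)


encoded = qp_encode("\u201ca\u201d a==True")
print(encoded)
print(qp_decode(encoded))
```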
Decoding qp(UTF-8)
main.py
__import__( "sys" ).stdout.buffer.write\
( __import__( 'quopri' ).decodestring( '''\
>When you print the variable =E2=80=9Ca=E2=80=9D it appears as True, but the program is it is not getting in the if a=3D=3DTrue:
'''
))
- transcript
>When you print the variable “a” it appears as True, but the program is it is not getting in the if a==True:
Headers (decoding iso-8859-1/base64)
print( __import__( 'base64' ).b64decode( source_str ).decode( 'iso-8859-1' ))
Subject: Re: =?UTF-8?B?SG93IHRvIHJlcGxhY2Ugc3BhY2UgaW4gYSBzdHJpbmcgd2l0aCBcbg==?=
main.py
print( __import__( 'base64' ).b64decode( 'SG93IHRvIHJlcGxhY2Ugc3BhY2UgaW4gYSBzdHJpbmcgd2l0aCBcbg==' ).decode( 'utf-8' ))
- transcript
How to replace space in a string with \n
Hex codes
main.py
print( ''.join( [ f'{x:02x} ' for x in b"example" ]))
- transcript
65 78 61 6d 70 6c 65
main.py
print( ''.join( [ f'{x:02x} ' for x in b"example" ]))
print( bytes.fromhex( ''.join( [ f'{x:02x} ' for x in b"example" ])))
print( bytes.fromhex( ''.join( [ f'{x:02x} ' for x in "äöüß".encode( 'utf-8' ) ])))
- transcript
65 78 61 6d 70 6c 65
b'example'
b'\xc3\xa4\xc3\xb6\xc3\xbc\xc3\x9f'
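The join-based formatting above can probably be shortened with bytes.hex, which accepts a separator argument from Python 3.8 on (an assumption about the interpreter in use; the helper names to_hex and from_hex are hypothetical). A sketch:

```python
def to_hex(data: bytes) -> str:
    """Render bytes as space-separated two-digit hex codes (Python 3.8+)."""
    return data.hex(" ")


def from_hex(text: str) -> bytes:
    """Parse hex codes back into bytes; whitespace between pairs is ignored."""
    return bytes.fromhex(text)


print(to_hex(b"example"))
print(from_hex("41 42 43"))
print(from_hex(to_hex("äöüß".encode("utf-8"))).decode("utf-8"))
```

Unlike the join above, this leaves no trailing space after the last code.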
Transcoding
Transcoding from iso-8859-1 to UTF-8
city.decode('cp1252').encode('utf-8')
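The one-liner above decodes the bytes with the source encoding and re-encodes the resulting string with the target encoding. A small sketch of the same idea as a reusable function; the name transcode and the sample city are assumptions for illustration only.

```python
def transcode(data: bytes, source: str = "cp1252", target: str = "utf-8") -> bytes:
    """Reinterpret bytes: decode with the source codec, encode with the target."""
    return data.decode(source).encode(target)


city = "Zürich".encode("cp1252")   # hypothetical input, one byte per character
print(city)
print(transcode(city))
```

In cp1252 the ü is the single byte 0xFC; after transcoding it becomes the two UTF-8 bytes 0xC3 0xBC.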
HTML
import html
s = html.unescape( s )
main.py
# encoding: utf-8
print( __import__( 'html.entities' ).entities.codepoint2name[ ord( 'é' )])
- transcript
eacute
main.py
'année'.encode(encoding='ascii',errors='xmlcharrefreplace').decode('ascii')
- transcript
'ann&#233;e'
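The two listings above can be combined: codepoint2name gives a named entity where one exists, and a numeric character reference works for everything else. The following is only a sketch; the function name to_named_entities is hypothetical, and ASCII characters (including &, < and >) are deliberately passed through unchanged.

```python
import html
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace non-ASCII characters by named or numeric HTML references."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                          # ASCII is kept as-is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")   # e.g. &eacute;
        else:
            out.append(f"&#{cp};")                  # numeric fallback
    return "".join(out)


encoded = to_named_entities("année")
print(encoded)
print(html.unescape(encoded))
```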
Miscellaneous
main.py
script = '''
Import os
Test
def Test():
print(“test”)
os.system(“pause”)
'''
exec\
( script.replace( '“', '"' ).\
replace( '”', '"' ).\
replace( 'I', 'i' ).\
replace( 'Test\n', '' ).\
replace( 'pr', ' pr' ).\
replace( 'os.', ' os.' ) + \
'\nTest()' )
- transcript
Press any key to continue . . .
test
main.py
# coding=cp1252
# coding of the source file specified above for u umlaut (todo: convert VBA to use UTF-8)
print( chr( 0x0041 ))
print( f"0x{ord( 'ü' ):04x}" )
print( bytes.fromhex('0040, 0041, 0042, 0043'.replace( ',', '' )).decode( 'utf-8' ))
print( list( b'ABC' ))
print( b'ABC'.hex() )
- transcript
A
0x00fc
@ABC
[65, 66, 67]
414243
main.py
# print( [ chr(int(__import__('ast').literal_eval('0x'+x.strip()))) for x in "231E, 231F, 231C, 231D".split(',') ])
print( flush=True, end='' )
__import__( "sys" ).stdout.buffer.write\
( chr( 0x231E ).encode( 'utf-8' ))
print( flush=True, end='' )
- transcript
main.py
print( __import__( "sysconfig" ).get_python_version() )
print( __import__( "sys" ).version.split()[ 0 ])
print( 'quoting the documentation (3.6 - 3.8): ' )
print( 'chr(i)' )
print( 'Return the string representing a character whose Unicode code point is the integer i.' )
# print( chr( 0x231E )) # Unicode Character 'BOTTOM LEFT CORNER' (U+231E)
print( flush=True, end='' )
__import__( "sys" ).stdout.buffer.write( chr( 0x231E ).encode( 'utf-8' ))
print( flush=True, end='' )
print( 'Test' )
print( flush=True, end='' )
__import__( "sys" ).stdout.buffer.write( chr( 0x231E ).encode( 'utf-8' ))
print( flush=True, end='' )
print( 'Test' )
print( flush=True, end='' )
__import__( "sys" ).stdout.buffer.write( chr( 0x231E ).encode( 'utf-8' ))
print( flush=True, end='' )
print( 'Test' )
print( flush=True, end='' )
__import__( "sys" ).stdout.buffer.write( chr( 0x231E ).encode( 'utf-8' ))
print( flush=True, end='' )
print( 'Test' )
- transcript
3.7
3.7.0
quoting the documentation (3.6 - 3.8):
chr(i)
Return the string representing a character whose Unicode code point is the integer i.
⌞Test
⌞Test
⌞Test
⌞Test
main.py
print( bin( 100 ))
print( 0b1100100 )
print( hex( 100 ))
print( 0x64 )
print( oct( 100 ))
print( 0o144 )
print( chr( 65 ))
print( ord( 'A' ))
- transcript
0b1100100
100
0x64
100
0o144
100
A
65
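The conversions above also work in the other direction: int() parses a digit string in a given base, and format() renders a number without the 0b/0o/0x prefix. A brief sketch; the helper names in_bases and parse_int are hypothetical.

```python
def in_bases(n: int) -> dict:
    """Render n in binary, octal and hexadecimal without prefixes."""
    return {"bin": format(n, "b"), "oct": format(n, "o"), "hex": format(n, "x")}


def parse_int(text: str, base: int = 0) -> int:
    """Parse a number; base 0 lets int() infer the base from a 0b/0o/0x prefix."""
    return int(text, base)


print(in_bases(100))
print(parse_int("0b1100100"), parse_int("0o144"), parse_int("0x64"))
print(parse_int("64", 16))
```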
Notes
Newsgroups: comp.lang.python,comp.lang.javascript
Here's a small comparative tutorial about URI encoding
and how to use it to repair mis-encoded strings.
Code examples are given for both JavaScript ("JS") and
Python ("py").
Section 1: Preparations
In Python, we need two imports:
py> from urllib.parse import quote
py> from urllib.parse import unquote
Section 2: One-pass operations
convert a character to its ISO 8859-1 representation and
then represent this in URI notation
JS> escape( 'ä' )
JS> "%E4"
py> quote( 'ä', encoding='iso8859-1' )
py> '%E4'
convert a character to its UTF-8 representation and then
represent this in URI notation
JS> encodeURI( 'ä' )
JS> "%C3%A4"
py> quote( 'ä', encoding='utf-8' )
py> '%C3%A4'
get the character from URI notation assuming it was
encoded using its ISO 8859-1 representation
JS> unescape( '%E4' )
JS> "ä"
py> unquote( '%E4', encoding='iso8859-1' )
py> 'ä'
get the character from URI notation assuming it was
encoded using its UTF-8 representation
JS> decodeURIComponent( '%C3%A4' )
JS> "ä"
py> unquote( '%C3%A4', encoding='utf-8' )
py> 'ä'
Section 3: Repairing a mis-encoded character sequence
Sometimes people decode UTF-8 as if it were ISO 8859-1.
This results in ugly strings like »Ã¤« (for »ä«).
JS> unescape( encodeURI( 'ä' ))
JS> "Ã¤"
py> unquote( quote( 'ä', encoding='utf-8' ), encoding='iso8859-1' )
py> 'Ã¤'
But with JavaScript or Python we can repair such strings!
We first URI-encode them to get a kind of octet sequence.
JS> escape( 'Ã¤' )
JS> "%C3%A4"
py> quote( 'Ã¤', encoding='iso8859-1' )
py> '%C3%A4'
Now we can decode this octet sequence using the correct
encoding!
JS> decodeURIComponent( escape( 'Ã¤' ))
JS> "ä"
py> unquote( quote( 'Ã¤', encoding='iso8859-1' ), encoding='utf-8' )
py> 'ä'
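In Python the detour through URI notation is not strictly needed: encoding the damaged string with the wrong codec recovers the original octets, which can then be decoded with the right one. A minimal sketch under that assumption; the function name repair_mojibake is hypothetical, and the sample uses escape sequences so that the listing itself cannot be mis-encoded.

```python
from urllib.parse import quote, unquote


def repair_mojibake(text: str, wrong: str = "iso8859-1", right: str = "utf-8") -> str:
    """Undo a decode made with the wrong codec by re-encoding and decoding again."""
    return text.encode(wrong).decode(right)


damaged = "\u00c3\u00a4"          # U+00C3 U+00A4: UTF-8 bytes of U+00E4 read as ISO 8859-1
print(repair_mojibake(damaged))

# The URI-based repair from the text gives the same result.
print(unquote(quote(damaged, encoding="iso8859-1"), encoding="utf-8"))
```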
Summary
Both Python and JavaScript can URI-encode ISO 8859-1
characters using either ISO 8859-1 or UTF-8, and both can
decode them again. The details of the function calls
differ somewhat. Naming the encoding explicitly in
the Python calls makes them more orthogonal (and more readable)
than the JavaScript function names, which do not mention the encodings.
PS:
In JavaScript, there is a difference between "encodeURI" and
"encodeURIComponent" that might be represented in Python as:
encodeURI( s ) --> urllib.parse.quote(s, safe='~@#$&()*!+=:;,.?/\'');
encodeURIComponent( s ) --> urllib.parse.quote(s, safe='~()*!.\'')
.
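The two mappings from the PS can be wrapped as Python functions for a quick comparison. This is a sketch only: the names encode_uri and encode_uri_component are hypothetical, the safe strings are taken from the PS above, and UTF-8 (the default of urllib.parse.quote) is assumed as the encoding.

```python
from urllib.parse import quote


def encode_uri(s: str) -> str:
    """Approximate JavaScript's encodeURI: URI delimiters are left intact."""
    return quote(s, safe="~@#$&()*!+=:;,.?/'")


def encode_uri_component(s: str) -> str:
    """Approximate JavaScript's encodeURIComponent: delimiters are escaped too."""
    return quote(s, safe="~()*!.'")


sample = "a b/ä?x=1"
print(encode_uri(sample))
print(encode_uri_component(sample))
```

The difference shows in characters such as "/", "?" and "=", which only the second function percent-encodes.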