Windows-1252

Windows-1252 or CP-1252 (code page 1252) is a single-byte character encoding of the Latin alphabet that was used by default in Microsoft Windows for English and many Romance and Germanic languages including Spanish, Portuguese, French, and German (though missing uppercase ẞ).

This character-encoding scheme is used throughout the Americas, Western Europe, Oceania, and much of Africa.

Windows-1252
MIME / IANA: windows-1252
Alias(es): cp1252 (code page 1252)
Language(s): All supported by ISO/IEC 8859-1 plus full support for French and Finnish and ligature forms for English; e.g. Danish (except for a rare exceptional letter), Irish, Italian, Norwegian, Portuguese, Spanish, Swedish, German (missing uppercase ẞ), Icelandic, Faroese, Luxembourgish, Albanian, Estonian, Swahili, Tswana, Catalan, Basque, Occitan, Rotokas, Toki Pona, Lojban, Romansh, Dutch (except the Ĳ/ĳ ligature character, substituted by IJ/ij or ÿ), and Slovene (except the č character, substituted by ç).
Created by: Microsoft
Standard: WHATWG Encoding Standard
Classification: extended ASCII, Windows-125x
Extends: ISO 8859-1 (excluding C1 controls)
Transforms / Encodes: ISO 8859-15

It is the most-used single-byte character encoding in the world. As of April 2024, 1.2% of all websites declare ISO 8859-1, which all modern browsers treat as Windows-1252 (as required by the HTML5 standard), and a further 0.3% declare Windows-1252 itself, for a total of 1.5% (and only 15 of the top 1000 websites).

Depending on the country or language, use on websites in 2024 can be much higher than the global average: 3.8% of websites in Brazil and 2.8% in Germany. (These figures are the sums of ISO-8859-1 and CP-1252 declarations.)

Details

This character encoding is a superset of ISO 8859-1 in terms of printable characters, but differs from the IANA's ISO-8859-1 by placing additional characters in the 0x80 to 0x9F (hex) range, which the ISO standards reserve for C1 control codes. Notable additions include curly quotation marks and all printable characters from ISO 8859-15. It is known to Windows by the code page number 1252, and by the IANA-approved name "windows-1252".
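The difference in the 0x80 to 0x9F range can be checked with Python's built-in codecs. The following is a minimal sketch; the sample bytes and variable names are chosen here for illustration only.

```python
import codecs

# Four bytes from the 0x80-0x9F range, where the two encodings disagree.
sample = bytes([0x80, 0x93, 0x94, 0x99])

# ISO-8859-1 maps each byte to the C1 control code with the same number.
as_latin1 = sample.decode("latin-1")

# Windows-1252 maps them to the euro sign, curly double quotes and
# the trade mark sign.
as_cp1252 = sample.decode("cp1252")

# Outside 0x80-0x9F the two encodings agree byte for byte.
shared = bytes(range(0x00, 0x80)) + bytes(range(0xA0, 0x100))
same_outside_c1 = shared.decode("latin-1") == shared.decode("cp1252")

# Python accepts the IANA name and resolves it to its own codec name.
canonical_name = codecs.lookup("windows-1252").name

print([f"U+{ord(c):04X}" for c in as_latin1])
print([f"U+{ord(c):04X}" for c in as_cp1252])
print(same_outside_c1, canonical_name)
```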

Starting in the 1990s, many Microsoft products that could produce HTML included Windows-1252-exclusive characters, but marked the encoding as ISO-8859-1, ASCII, or undeclared.[citation needed] Characters exclusive to Windows-1252 would often render incorrectly on non-Windows operating systems (often as question marks, blanks, or boxes). In particular, typographers' quotes — curly variants of the standard straight apostrophes and quotation marks in US-ASCII — were commonly used in files produced in Windows applications such as Microsoft Word due to the smart quotes feature, which can automatically convert straight apostrophes and quotation marks to the curly variants. To fix this, by 2000 most web browsers and e-mail clients treated the charsets ISO-8859-1 and US-ASCII as Windows-1252[citation needed] — this behavior is now required by the HTML5 specification. Undeclared charsets in HTML are also assumed to be Windows-1252.
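The mislabelling problem and the browser workaround can be reproduced in a few lines. This is a simplified sketch: `decode_like_browser` and `LEGACY_LABELS` are hypothetical names, the label set is a small illustrative subset rather than the full table of the HTML5 or WHATWG specifications, and `errors="replace"` is a choice made for this example.

```python
# Text as a Windows application with smart quotes might have saved it.
original = "\u201cIt\u2019s here\u201d"
raw = original.encode("cp1252")  # the quotes become bytes 0x93, 0x92, 0x94

# A strict ISO-8859-1 reader turns the quotes into invisible C1 controls.
strict_latin1 = raw.decode("latin-1")

# A strict US-ASCII reader cannot decode them; with errors="replace" each
# offending byte becomes U+FFFD, often drawn as a question mark or a box.
strict_ascii = raw.decode("ascii", errors="replace")

# Illustrative subset of labels that are treated as Windows-1252.
LEGACY_LABELS = {"iso-8859-1", "us-ascii", "ascii", "windows-1252"}


def decode_like_browser(data, declared=None):
    """Hypothetical helper: decode legacy or missing labels as Windows-1252."""
    label = (declared or "windows-1252").strip().lower()
    codec = "cp1252" if label in LEGACY_LABELS else label
    return data.decode(codec, errors="replace")


print(repr(strict_latin1))
print(repr(strict_ascii))
print(decode_like_browser(raw, "ISO-8859-1"))
print(decode_like_browser(raw))
```

With the helper, the same bytes decode to the intended curly quotes whether the document is labelled ISO-8859-1, US-ASCII, or not labelled at all.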

Historically, the phrase "ANSI Code Page" was used in Windows to refer to non-DOS encodings; the intention was that most of these would be ANSI standards such as ISO-8859-1. Even though Windows-1252 was the first and by far most popular code page named so in Microsoft Windows parlance, the code page has never been an ANSI standard. Microsoft explains, "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."

In LaTeX packages, CP-1252 is referred to as "ansinew".

IBM uses code page 1252 (CCSID 1252 and euro sign extended CCSID 5348) for Windows-1252.

It is called "WE8MSWIN1252" by Oracle.

Codepage layout

The following table shows Windows-1252. Each difference from ISO-8859-1 is accompanied by its Unicode code point number, based on the Unicode.org mapping of Windows-1252 with "best fit".

Windows-1252 (CP1252)
0 1 2 3 4 5 6 7 8 9 A B C D E F
0_ NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1_ DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2_  SP  ! " # $ % & ' ( ) * + , - . /
3_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4_ @ A B C D E F G H I J K L M N O
5_ P Q R S T U V W X Y Z [ \ ] ^ _
6_ ` a b c d e f g h i j k l m n o
7_ p q r s t u v w x y z { | } ~ DEL
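8_ €(20AC) [unused] ‚(201A) ƒ(0192) „(201E) …(2026) †(2020) ‡(2021) ˆ(02C6) ‰(2030) Š(0160) ‹(2039) Œ(0152) [unused] Ž(017D) [unused]
9_ [unused] ‘(2018) ’(2019) “(201C) ”(201D) •(2022) –(2013) —(2014) ˜(02DC) ™(2122) š(0161) ›(203A) œ(0153) [unused] ž(017E) Ÿ(0178)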
A_ NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
B_ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
C_ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
D_ Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
E_ à á â ã ä å æ ç è é ê ë ì í î ï
F_ ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

  According to the information on Microsoft's and the Unicode Consortium's websites, positions 81, 8D, 8F, 90, and 9D are unused; however, the Windows API MultiByteToWideChar maps these to the corresponding C1 control codes. The "best fit" mapping documents this behavior, too.
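Python's `cp1252` codec follows the stricter reading and rejects the five unused positions. The sketch below lists them and adds a fallback that maps each one to the C1 control code with the same number, as the "best fit" mapping does. `decode_best_fit` and `strict_rejects` are hypothetical helper names, and this is an illustration of the mapping rather than a reimplementation of MultiByteToWideChar.

```python
# Positions listed above as unused in Windows-1252.
UNUSED = (0x81, 0x8D, 0x8F, 0x90, 0x9D)


def strict_rejects(byte):
    """Return True if Python's strict cp1252 codec cannot decode this byte."""
    try:
        bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False


def decode_best_fit(data):
    """Decode Windows-1252, sending the unused bytes to C1 control codes."""
    out = []
    for byte in data:
        if byte in UNUSED:
            out.append(chr(byte))  # e.g. byte 0x81 becomes U+0081
        else:
            out.append(bytes([byte]).decode("cp1252"))
    return "".join(out)


rejected = [b for b in range(0x100) if strict_rejects(b)]
print([f"{b:02X}" for b in rejected])
print(repr(decode_best_fit(b"\x80\x81A")))
```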

History

  • The first version[when?] of the codepage 1252, used in Microsoft Windows 1.0, did not have positions D7 and F7 defined. All the characters in the range 80–9F were undefined too.
  • In the second version, used in Microsoft Windows 2.0, positions D7, F7, 91, and 92 were defined.
  • The third version, used since Microsoft Windows 3.1, had all the present-day positions defined, except the euro sign and the Z with caron character pair.
  • The final version listed above debuted in Microsoft Windows 98 and was ported to older versions of Windows with the euro symbol update.

OS/2 extensions

The OS/2 operating system supports an encoding by the name of Code page 1004 (CCSID 1004) or "Windows Extended". This mostly matches code page 1252, with the exception of certain C0 control characters being replaced by diacritic characters.

Code page 1004 (differing rows only)
0 1 2 3 4 5 6 7 8 9 A B C D E F
0_ NUL SOH STX ETX ˉ(02C9) ˘(02D8) ˙(02D9) BEL ˚(02DA) HT ˝(02DD) ˛(02DB) ˇ(02C7) CR SO SI
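Python ships no codec for code page 1004, but the differing row above is enough to sketch one on top of `cp1252`. The names `CP1004_OVERRIDES` and `decode_cp1004` are hypothetical, and the sketch assumes every other position matches code page 1252.

```python
# Positions in row 0_ where code page 1004 replaces a C0 control code
# with a spacing diacritic.
CP1004_OVERRIDES = {
    0x04: "\u02c9",  # modifier letter macron
    0x05: "\u02d8",  # breve
    0x06: "\u02d9",  # dot above
    0x08: "\u02da",  # ring above
    0x0A: "\u02dd",  # double acute accent
    0x0B: "\u02db",  # ogonek
    0x0C: "\u02c7",  # caron
}


def decode_cp1004(data):
    """Decode bytes as code page 1004 (sketch built on Python's cp1252)."""
    # cp1252 maps each C0 byte to the code point with the same number,
    # so the overrides can be applied to the decoded text.
    return data.decode("cp1252").translate(CP1004_OVERRIDES)


print(decode_cp1004(b"a\x0cb"))
print(repr(decode_cp1004(b"\x09\x0d")))
```

Because the helper relies on the strict `cp1252` codec, it raises `UnicodeDecodeError` for the positions that Windows-1252 leaves unused.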

MSDOS extensions [rare]

There is a rarely used, but useful, graphics extended code page 1252 where codes 0x00 to 0x1f allow for box drawing as used in applications such as MSDOS Edit and Codeview. One of the applications to use this code page was an Intel Corporation Install/Recovery disk image utility from mid/late 1995. These programs were written for its P6 User Test Program machines (US example). It was used exclusively in its then EMEA region (Europe, Middle East & Africa). In time the programs were changed to use code page 850.

Graphics Extended Code Page 1252[citation needed]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0_
1_

Palm OS variant

Each Palm OS device supports a single language and a single character encoding, depending on its locale.

For languages such as English and French, Palm OS uses a custom character encoding based on Windows-1252. For Japanese, it instead uses a multibyte character encoding based on code page 932. Regardless of the system locale, all characters in the range 0x00 to 0x7F are guaranteed to be the same, except 0x5C, which is the Yen sign in Japanese and a backslash in all others.

Palm OS 3.1 introduced several changes to the character encoding to better align with Windows-1252:

  • The special Palm OS glyphs "shortcut stroke" (0x9D) and "command stroke" (0x9E) were copied to 0x16 and 0x17, to ensure they were in the range guaranteed to be consistent between locales. Starting in Palm OS 3.3, 0x16 and 0x17 are the only code points for those characters, leaving 0x9D and 0x9E undefined.
  • The numeric space (0x80) and horizontal ellipsis (0x85) were copied to 0x19 and 0x18 (respectively), to ensure they were in the range guaranteed to be consistent between locales.
  • The Euro sign was added at 0x80, replacing what was previously the numeric space.
  • The playing card suits were copied to the font Symbol 9, although their original code points remain valid.

The following is the variant of Windows-1252 used by Palm OS 3.3 onward for English and several other locales. Python gives it the palmos label, describing it as the encoding for Palm OS 3.5. Differences from Windows-1252 have their Unicode code point.

Palm OS 3.3 character encoding (only the positions that differ from Windows-1252)
8D 8E 8F 90
♦(2666) ♣(2663) ♥(2665) ♠(2660)
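The playing card suits can be checked against the `palmos` codec that Python provides. This is a short sketch; the variable names are illustrative.

```python
# Bytes 0x8D, 0x8E, 0x8F and 0x90 carry the playing card suits.
suits = bytes([0x8D, 0x8E, 0x8F, 0x90]).decode("palmos")

# The euro sign sits at 0x80, as in Windows-1252.
euro = b"\x80".decode("palmos")

print([f"U+{ord(c):04X}" for c in suits])
print(f"U+{ord(euro):04X}")
```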


This article uses material from the Wikipedia English article Windows-1252, which is released under the Creative Commons Attribution-ShareAlike 3.0 license ("CC BY-SA 3.0"); additional terms may apply (view authors). Content is available under CC BY-SA 4.0 unless otherwise noted. Images, videos and audio are available under their respective licenses.