25 votes
Accepted

Transcode UCS-4BE to UTF-8

Efficient file I/O By default, files opened with fopen() are buffered, meaning that not every call to fread() or ...
G. Sliepen
  • 68.1k
24 votes

Transcode UCS-4BE to UTF-8

log is already declared in <math.h>. You don't need to declare it yourself. In fact, it could be harmful. As stated in ...
vnp
  • 58.4k
13 votes

Transcode UCS-4BE to UTF-8

This program reads 4 byte codepoints (in BIG ENDIAN) from a file strictly called "input.data" and creates another file called "ENCODED.data" with the relative encoding in UTF8. Needless to say, ...
Maarten Bodewes
12 votes
Accepted

Printing Command Line Unicode Chess Board

numpy In this case there is no need to use. For an 8 by 8 board, filled with strings, there is no advantage to using it, apart from the possibility to index row and column at the same time enums You ...
Maarten Fabré
11 votes
Accepted

Subclassed Python Counter for a more visually user-friendly __str__() method, for anagrams

1. Review There is no docstring. What kind of object is a LetterCounter? The Python style guide recommends restricting lines to 79 characters. If you did this, then ...
Gareth Rees
10 votes
Accepted

A String View Library in C

Only include the header files that are required: #include <stdbool.h> #include <stddef.h> #include <stdio.h> It is useless to include both ...
Madagascar
  • 10.1k
9 votes

Transcode UCS-4BE to UTF-8

regarding: ptr = fopen("input.data", "rb"); out = fopen("ENCODED.data", "wb"); always check (!=NULL) the returned value to assure the operation was successful....
user3629249
  • 2,908
9 votes

Transcode UCS-4BE to UTF-8

Code fails to detect invalid code points There are 1,112,064 valid Unicode code points, not 2^32. The valid range is [0x0 - 0x10FFFF] except the sub-range of [0xD800 - 0xDFFF]. This latter sub-range ...
chux
  • 36.2k
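
As a rough illustration of the range check described in the excerpt above, here is a minimal Python sketch. The function name `is_valid_code_point` is hypothetical and does not come from the reviewed code; the sketch only assumes the range stated in the excerpt, [0x0 - 0x10FFFF] minus the surrogate block [0xD800 - 0xDFFF].

```python
def is_valid_code_point(cp: int) -> bool:
    """Return True if cp lies in [0x0, 0x10FFFF] and is not a surrogate.

    Hypothetical helper for illustration; not part of the reviewed program.
    """
    in_range = 0 <= cp <= 0x10FFFF
    is_surrogate = 0xD800 <= cp <= 0xDFFF
    return in_range and not is_surrogate


# The count of valid code points follows from the same two ranges:
# the whole range minus the surrogate block.
valid_count = (0x10FFFF + 1) - (0xDFFF - 0xD800 + 1)
```

A transcoder could call a check like this on each 4-byte input value before encoding it, and report an error instead of emitting bytes when the check fails.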
8 votes

Pretty printing of the numpy ndarrays

If A.ndim is not in 1, 2, 3, your code tries to return a non-existing string s. It would be ...
Graipher
  • 41.5k
8 votes

Transcode UCS-4BE to UTF-8

As others have said, don't use floating point math, but in some sense that's reviewing the wrong layer. The real issue behind that is that you don't need to be branching on a derived quantity, the ...
R.. GitHub STOP HELPING ICE
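
The excerpt above is truncated, so the following is not the answer's own code. It is a sketch of one common way to avoid floating-point math when encoding: compare the code point directly against the UTF-8 range boundaries (0x80, 0x800, 0x10000) using only integer operations. The names `utf8_length` and `encode_utf8` are hypothetical, and the input is assumed to be an already-validated code point.

```python
def utf8_length(cp: int) -> int:
    """Number of UTF-8 bytes for a valid code point, via integer comparisons."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def encode_utf8(cp: int) -> bytes:
    """Encode one valid code point as UTF-8 using shifts and masks only."""
    n = utf8_length(cp)
    if n == 1:
        return bytes([cp])
    if n == 2:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if n == 3:
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

Each branch depends only on the code point itself, so no logarithm or derived bit count is needed.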
8 votes

UTF-8 to UTF-16 (char8_t string to char16_t string)

Good job on general organization: use of namespace and nested detail namespace, use of different sized char types, marking things with noexcept, etc. Error: Why do you return ...
JDługosz
  • 11.7k
7 votes

Printing Command Line Unicode Chess Board

Piece dictionary chrs is a very generic name. Since this is all about chess, you should be more specific by calling it something like ...
AlexV
  • 7,323
7 votes
Accepted

Universal string conversion

First, you should not be using Python 2 anymore, if at all possible. It will reach end of support at the end of the year. In Python 3 your code would just be: ...
Graipher
  • 41.5k
7 votes
Accepted

Convert UTF8 string to UTF32 string in C

There are two main things to talk about: checking the input, and buffer handling (your malloc question). It's a very bad idea to do things like ...
Raziel
  • 118
7 votes

JavaScript string to Unicode (Hex)

This is regarding the edge-cases and test cases mentioned in the question: [...characters] // or Array.from(characters) handles splitting the characters of string ...
adiga
  • 513
7 votes

Client server communications through unix signals in C

Why Unicode works The reason why Unicode works is that the UTF-8 encoding uses one or more chars per Unicode character. Thus, UTF-8 encoded strings fit into a C string, ...
G. Sliepen
  • 68.1k
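
To make the "one or more chars per Unicode character" point concrete, here is a small Python sketch (not taken from the answer) that measures the UTF-8 length of a few sample characters. The sample characters and variable names are arbitrary choices for illustration.

```python
# Sample characters chosen to need 1, 2, 3, and 4 bytes in UTF-8.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes in its UTF-8 encoding.
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

# UTF-8 never produces a zero byte for any character other than U+0000,
# which is why encoded text can live in a NUL-terminated byte string.
encoded = "".join(samples).encode("utf-8")
contains_nul = 0 in encoded
```

Here `byte_lengths` is `[1, 2, 3, 4]` and `contains_nul` is `False`.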
6 votes

Convert UTF8 string to UTF32 string in C

Magic numbers The implementation uses them a lot. While the bit-notation helps with indicating what is happening, it doesn't show the intention. What reads clearer: ...
hoffmale
  • 6,528
6 votes
Accepted

Game of Life improvement

I'm unsure what you mean by saying "improve the user interface". AFAIK Game of life doesn't have much of a user interface as it's a simulation that only displays the grid and whatever happens on it. ...
yuri
  • 4,518
6 votes

Printing Command Line Unicode Chess Board

Python-specific improvements: creating bw_checkers (in get_checkers function). Instead of appending repeatedly to previously ...
RomanPerekhrest
6 votes

UTF-8 to UTF-16 (char8_t string to char16_t string)

Adding to JDługosz's comments: Rename utf8_to_utf16() to convert() Why? Because the function's argument types already restrict ...
G. Sliepen
  • 68.1k
6 votes

UTF-8 to UTF-16 (char8_t string to char16_t string)

Re-implementing this kind of low-level conversion seems a waste of effort when C++ goes to some length to provide these facilities for us: ...
Toby Speight
  • 86.3k
5 votes
Accepted

Preliminary draft of a data structure for a goodies tracker

It's weird that your get_capacity() returns a string. I'm guessing that it's to facilitate string concatenation in the print() ...
200_success
5 votes
Accepted

UTF-8 to UTF-16 (char8_t string to char16_t string)

suggested architecture ① decode UTF-8 stream Construct with a range of UTF-8 encoded bytes, present the object as an iterator that provides complete 32-bit characters. It has another member function ...
JDługosz
  • 11.7k
5 votes

Unicode transfer format conversions as range adapters

The input range mystery I’ve been planning to write this review for a few days now. The thing that’s held me back is that there is a mystery I haven’t been able to solve: why you were unable to use <...
indi
  • 16k
4 votes

Convert accented character to user name

PHP supports ICU via ext/intl and this includes transliteration. ...
ThW
  • 151
4 votes

Convert UTF8 string to UTF32 string in C

Welcome back to C Binary constants are not part of standard C - yet // 0b10000000 0x80 Lack of error detection i += 2;, <...
chux
  • 36.2k
4 votes

Convert UTF8 string to UTF32 string in C

Portability In calculations like ...
Martin R
  • 24k
4 votes
Accepted

Convert to UTF-8 all files in a directory

The code in the post uses the chardet library to determine the encoding of the file, but then the only use it makes of that information is to decide whether or not ...
Gareth Rees
4 votes

Filter out strings that are short or that are outside the alphabet of a language

It took me a while to decipher what you meant with 'less than n values'. What your function actually does: it filters out all words that are either shorter than the specified length (...
Pieter Witvoet
4 votes
Accepted

Normalize a GSM short message text

1. Review Normally, a class is used to implement a collection of objects with similar behaviour, and an object is used to implement some kind of persistent thing or data structure. But what kind of ...
Gareth Rees

Only top scored, non community-wiki answers of a minimum length are eligible