All Questions
7 questions
3 votes, 3 answers, 165 views
Filter out strings that are short or that are outside the alphabet of a language
I wrote this function to remove words shorter than n characters from an array. More precisely, I updated my old function and added some functionality. Since I am a beginner in this ...
10 votes, 1 answer, 570 views
Highlight specific words in a sentence with diacritics
I am looking for improvements, particularly to the regex, in the way I highlight specific words in a string.
The keywords are stored in my database without any diacritics.
The user comes with ...
15 votes, 1 answer, 60k views
Getting data correctly from <span> tag with BeautifulSoup and regex
I am scraping an online shop page, trying to get the price mentioned in that page. In the following block the price is mentioned:
...
8 votes, 1 answer, 678 views
dir="auto" JavaScript shim for IE
Reason for script:
dir="auto" is an attribute value from the HTML5 spec that is currently poorly supported in IE and Opera. The project I am working on only ...
4 votes, 1 answer, 921 views
Unicode parsing in PHP
Firstly, apologies if this is not the right type of question for this site; I posted it on Stack Overflow, but it was closed with a suggestion that I post it here.
I’m in the process of converting from Latin 15 ...
3 votes, 1 answer, 2k views
Simplify regular expression? (Converting Unicode fractions to TeX)
Background
I'm converting Unicode text to TeX for typesetting. In the input, I'm allowing simple fractions like ½ and ⅔ using single Unicode characters and complex fractions like ¹²³/₄₅₆ using ...
7 votes, 2 answers, 1k views
N-gram generation
I have the following code, but it's too slow. How can I make it faster?
...