Test if string contains only letters (a-z + é ü ö ê å ø etc..)



Problem :

I want to match a string to make sure it contains only letters.

I’ve got this and it works just fine:

var onlyLetters = /^[a-zA-Z]*$/.test(myString);


Since I speak another language too, I need to allow all letters, not just A-Z. For example:

é ü ö ê å ø

Does anyone know if there is a global 'alpha' term that includes all letters to use with RegExp? Or even better, does anyone have some kind of solution?

Thanks a lot

Just realized that you might also want to allow ‘-‘ and ‘ ‘ in case of a double name like: ‘Mary-Ann’ or ‘Mary Ann’

Solution :

I don’t know the actual reason for doing this, but if you want to use it as a pre-check for, say, login names or user nicknames, I’d suggest you enter the characters yourself and don’t use the whole set of ‘alpha’ characters you’ll find in Unicode, because you probably won’t see a visual difference between the following letters:

А ≠ A ≠ Α  # cyrillic, latin, greek

In such cases it’s better to specify the allowed letters manually if you want to minimise account faking and such.
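
To make the lookalike problem concrete, here is a small sketch (the helper name `codePointLabel` is made up for illustration) that prints the code points of the three capitals above and checks which of them a plain ASCII letter class accepts:

```javascript
// Three capital letters that render almost identically but are distinct code points.
const lookalikes = ['\u0410', 'A', '\u0391']; // Cyrillic, Latin, Greek

// Hypothetical helper: format a character's code point as U+XXXX.
function codePointLabel(ch) {
  return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
}

const labels = lookalikes.map(codePointLabel);
console.log(labels.join(' ')); // U+0410 U+0041 U+0391

// A hand-written ASCII class accepts only the Latin letter.
const asciiOnly = /^[A-Za-z]+$/;
const accepted = lookalikes.filter((ch) => asciiOnly.test(ch));
console.log(accepted.length); // 1
```

A class that accepts "any letter" would let all three through, which is why an explicit list is the safer choice for unique names.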


Well, if it’s for a field which is supposed to be non-unique, I would allow Greek as well. I wouldn’t feel comfortable forcing users to change their name to a latinised version.

But for unique fields like nicknames you need to give your other visitors of the site a hint, that it’s really the nickname they think it is. Bad enough that people will fake accounts with interchanging I and l already. Of course, it’s something that depends on your users; but to be sure I think it’s better to allow basic latin + diacritics only. (Maybe have a look at this list: Latin-derived_alphabet)

As an untested suggestion (with ‘-’, ‘_’ and ‘ ’):

/^[a-zA-Z-_ ’'‘ÆÐƎƏƐƔIJŊŒẞÞǷȜæðǝəɛɣijŋœĸſßþƿȝĄƁÇĐƊĘĦĮƘŁØƠŞȘŢȚŦŲƯY̨Ƴąɓçđɗęħįƙłøơşșţțŧųưy̨ƴÁÀÂÄǍĂĀÃÅǺĄÆǼǢƁĆĊĈČÇĎḌĐƊÐÉÈĖÊËĚĔĒĘẸƎƏƐĠĜǦĞĢƔáàâäǎăāãåǻąæǽǣɓćċĉčçďḍđɗðéèėêëěĕēęẹǝəɛġĝǧğģɣĤḤĦIÍÌİÎÏǏĬĪĨĮỊIJĴĶƘĹĻŁĽĿʼNŃN̈ŇÑŅŊÓÒÔÖǑŎŌÕŐỌØǾƠŒĥḥħıíìiîïǐĭīĩįịijĵķƙĸĺļłľŀʼnńn̈ňñņŋóòôöǒŏōõőọøǿơœŔŘŖŚŜŠŞȘṢẞŤŢṬŦÞÚÙÛÜǓŬŪŨŰŮŲỤƯẂẀŴẄǷÝỲŶŸȲỸƳŹŻŽẒŕřŗſśŝšşșṣßťţṭŧþúùûüǔŭūũűůųụưẃẁŵẅƿýỳŷÿȳỹƴźżžẓ]+$/.test(myString)

Another edit:
I have added the apostrophe for people with names like O’Neill or O’Reilly. (And the straight and the reversed apostrophe for people who can’t enter the curly one correctly.)

var onlyLetters = /^[a-zA-Z\u00C0-\u00ff]+$/.test(myString)
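
One caveat with a contiguous Latin-1 range from U+00C0 to U+00FF: it also contains the multiplication sign (U+00D7) and the division sign (U+00F7), which are not letters. A possible variant, offered only as a sketch, splits the range around those two code points (the names `latin1Loose` and `latin1Letters` are made up here):

```javascript
// The contiguous range, which also lets U+00D7 and U+00F7 through.
const latin1Loose = /^[a-zA-Z\u00C0-\u00FF]+$/;

// Assumed variant: the same range split so the two symbols are excluded.
const latin1Letters = /^[a-zA-Z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]+$/;

console.log(latin1Loose.test('a\u00D7b'));   // true  (multiplication sign accepted)
console.log(latin1Letters.test('a\u00D7b')); // false
console.log(latin1Letters.test('\u00E9\u00FC\u00F6\u00EA\u00E5\u00F8')); // true (é ü ö ê å ø)
console.log(latin1Letters.test('\u0142'));   // false (ł lies outside Latin-1)
```

Either way, the range stops at U+00FF, so letters such as ł or ő from the Latin Extended blocks are still rejected.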

You can’t do this in JS. It has very limited regex and normalizer support. You would need to construct a lengthy and unmaintainable character array with all possible Latin characters with diacritical marks (I guess there are around 500 different ones). Rather, delegate the validation task to the server side, which uses another language with more regex capabilities, if necessary with the help of Ajax.

In a full-fledged regex environment you could just test if the string matches \p{L}+. Here’s a Java example:

boolean valid = string.matches("\\p{L}+");

Alternatively, you could also normalize the text to get rid of the diacritical marks and check if it contains [A-Za-z]+ only. Here’s again a Java example:

string = Normalizer.normalize(string, Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
boolean valid = string.matches("[A-Za-z]+");

PHP supports similar functions.
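
Engines that implement ES2018 accept Unicode property escapes when the regex carries the `u` flag, so where older browsers are not a concern the `\p{L}` test can be written in JavaScript as well. The following is a minimal sketch under that assumption; `nameLike` is a hypothetical extension for double names and apostrophes, not something from the answers here:

```javascript
// Requires an engine with ES2018 Unicode property escapes and the `u` flag.
const onlyLetters = /^\p{L}+$/u;

// Hypothetical variant: runs of letters joined by a single space,
// hyphen, straight apostrophe, or curly apostrophe (U+2019).
const nameLike = /^\p{L}+(?:[ '\u2019-]\p{L}+)*$/u;

console.log(onlyLetters.test('\u00E9\u00FC\u00F6\u00EA\u00E5\u00F8')); // true (é ü ö ê å ø)
console.log(onlyLetters.test('abc1'));   // false
console.log(nameLike.test('Mary-Ann'));  // true
console.log(nameLike.test("O'Neill"));   // true
console.log(nameLike.test('Mary--Ann')); // false
```

Note that `\p{L}` accepts letters from every script, including Cyrillic and Greek lookalikes, and that it does not match combining marks, so input in decomposed form would need `normalize('NFC')` first.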

When I tried to implement @Debilski’s solution JavaScript didn’t like the extended Latin characters — I had to code them as JavaScript escapes:

// The huge unicode escape string is equal to ÆÐƎƏƐƔIJŊŒẞÞǷȜæðǝəɛɣijŋœĸſßþƿȝĄƁÇĐƊĘĦ
// ƏƐĠĜǦĞĢƔáàâäǎăāãåǻąæǽǣɓćċĉčçďḍđɗðéèėêëěĕēęẹǝəɛġĝǧğģɣĤḤĦIÍÌİÎÏǏĬĪĨĮỊ
// IJĴĶƘĹĻŁĽĿʼNŃN̈ŇÑŅŊÓÒÔÖǑŎŌÕŐỌØǾƠŒĥḥħıíìiîïǐĭīĩįịijĵķƙĸĺļłľŀʼnńn̈ňñ
// ŧþúùûüǔŭūũűůųụưẃẁŵẅƿýỳŷÿȳỹƴźżžẓ

function isAlpha(string) {
    var patt = /^[a-zA-Z\u00C6\u00D0\u018E\u018F\u0190\u0194\u0132\u014A\u0152\u1E9E\u00DE\u01F7\u021C\u00E6\u00F0\u01DD\u0259\u025B\u0263\u0133\u014B\u0153\u0138\u017F\u00DF\u00FE\u01BF\u021D\u0104\u0181\u00C7\u0110\u018A\u0118\u0126\u012E\u0198\u0141\u00D8\u01A0\u015E\u0218\u0162\u021A\u0166\u0172\u01AFY\u0328\u01B3\u0105\u0253\u00E7\u0111\u0257\u0119\u0127\u012F\u0199\u0142\u00F8\u01A1\u015F\u0219\u0163\u021B\u0167\u0173\u01B0y\u0328\u01B4\u00C1\u00C0\u00C2\u00C4\u01CD\u0102\u0100\u00C3\u00C5\u01FA\u0104\u00C6\u01FC\u01E2\u0181\u0106\u010A\u0108\u010C\u00C7\u010E\u1E0C\u0110\u018A\u00D0\u00C9\u00C8\u0116\u00CA\u00CB\u011A\u0114\u0112\u0118\u1EB8\u018E\u018F\u0190\u0120\u011C\u01E6\u011E\u0122\u0194\u00E1\u00E0\u00E2\u00E4\u01CE\u0103\u0101\u00E3\u00E5\u01FB\u0105\u00E6\u01FD\u01E3\u0253\u0107\u010B\u0109\u010D\u00E7\u010F\u1E0D\u0111\u0257\u00F0\u00E9\u00E8\u0117\u00EA\u00EB\u011B\u0115\u0113\u0119\u1EB9\u01DD\u0259\u025B\u0121\u011D\u01E7\u011F\u0123\u0263\u0124\u1E24\u0126I\u00CD\u00CC\u0130\u00CE\u00CF\u01CF\u012C\u012A\u0128\u012E\u1ECA\u0132\u0134\u0136\u0198\u0139\u013B\u0141\u013D\u013F\u02BCN\u0143N\u0308\u0147\u00D1\u0145\u014A\u00D3\u00D2\u00D4\u00D6\u01D1\u014E\u014C\u00D5\u0150\u1ECC\u00D8\u01FE\u01A0\u0152\u0125\u1E25\u0127\u0131\u00ED\u00ECi\u00EE\u00EF\u01D0\u012D\u012B\u0129\u012F\u1ECB\u0133\u0135\u0137\u0199\u0138\u013A\u013C\u0142\u013E\u0140\u0149\u0144n\u0308\u0148\u00F1\u0146\u014B\u00F3\u00F2\u00F4\u00F6\u01D2\u014F\u014D\u00F5\u0151\u1ECD\u00F8\u01FF\u01A1\u0153\u0154\u0158\u0156\u015A\u015C\u0160\u015E\u0218\u1E62\u1E9E\u0164\u0162\u1E6C\u0166\u00DE\u00DA\u00D9\u00DB\u00DC\u01D3\u016C\u016A\u0168\u0170\u016E\u0172\u1EE4\u01AF\u1E82\u1E80\u0174\u1E84\u01F7\u00DD\u1EF2\u0176\u0178\u0232\u1EF8\u01B3\u0179\u017B\u017D\u1E92\u0155\u0159\u0157\u017F\u015B\u015D\u0161\u015F\u0219\u1E63\u00DF\u0165\u0163\u1E6D\u0167\u00FE\u00FA\u00F9\u00FB\u00FC\u01D4\u016D\u016B\u0169\u0171\u016F\u0173\u1EE5\u01B0\u1E83\u1E81\u0175\u1E85\u01BF\u00FD\u1EF3\u0177\u00FF\u0233\u1EF9\u01B4\u017A\u017C\u017E\u1E93]+$/;
    return patt.test(string);
}

There should be, but the regex will be localization dependent. Thus, é ü ö ê å ø won’t be filtered if you’re on a US localization, for example. To ensure your web site does what you want across all localizations, you should explicitly write out the characters in a form similar to what you are already doing.

The only standard one I am aware of, though, is \w, which would match all alphanumeric characters. You could do it the “standard” way by running two regexes, one to verify \w matches and another to verify that \d (all digits) does not match, which would result in a guaranteed alpha-only string. Again, I’d strongly urge you not to use this technique as there’s no guarantee what \w will represent in a given localization, but this does answer your question.

This can be tricky; unfortunately, JavaScript has pretty poor support for internationalization. To do this check you’ll have to create your own character class. This is because, for instance, \w is the same as [0-9A-Z_a-z], which won’t help you much, and there isn’t anything like [[:alpha:]] in JavaScript. But since it sounds like you’re only going to use one other language, you can probably just add those other characters into your character class.
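
As a rough sketch of that approach: without flags, `\w` in JavaScript covers only `[A-Za-z0-9_]`, so accented letters fail it, and a custom class has to be assembled by hand. The helper `makeLetterTest` and the letter list passed to it are assumptions for illustration only:

```javascript
// \w is ASCII-only here, so accented letters do not match.
const wordOnly = /^\w+$/;
console.log(wordOnly.test('abc_123')); // true
console.log(wordOnly.test('\u00E9'));  // false (é)

// Hypothetical helper: build an "ASCII letters plus these extras" tester.
function makeLetterTest(extraLetters) {
  // Escape characters that are special inside a character class.
  const escaped = extraLetters.replace(/[\\\]\[^-]/g, '\\$&');
  const pattern = new RegExp('^[A-Za-z' + escaped + ']+$');
  return (s) => pattern.test(s);
}

// Assumed list of extra letters for one additional language.
const isAlpha = makeLetterTest('éüöêåøÉÜÖÊÅØ');
console.log(isAlpha('Søren')); // true
console.log(isAlpha('Zoë'));   // false (ë is not in the list)
console.log(isAlpha('abc1'));  // false
```

Keeping the extra letters in one string makes the list easy to extend when another language is added.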

By the way, I think you’ll need a ? or * in your regexp there if myString can be longer than one character.

The full example,


I don’t know anything about JavaScript, but if it has proper Unicode support, convert your string to a decomposed form, then remove the diacritics from it ([\u0300-\u036f\u1dc0-\u1dff]). Then your letters will only be ASCII ones.
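
JavaScript engines that implement ES2015 provide `String.prototype.normalize`, so the decompose-and-strip idea can be sketched as follows. The function names are made up for this example, and the combining-mark ranges are the ones quoted above:

```javascript
// Decompose to NFD, then drop combining diacritical marks.
function stripDiacritics(s) {
  return s.normalize('NFD').replace(/[\u0300-\u036f\u1dc0-\u1dff]/g, '');
}

// Hypothetical check: only ASCII letters remain after stripping.
function isLettersAfterStripping(s) {
  return /^[A-Za-z]+$/.test(stripDiacritics(s));
}

console.log(stripDiacritics('\u00E9\u00FC\u00F6\u00EA\u00E5')); // euoea (from é ü ö ê å)
console.log(isLettersAfterStripping('Zo\u00EB'));               // true  (Zoë)
console.log(isLettersAfterStripping('S\u00F8ren'));             // false (Søren)
```

The last line shows the limit of this technique: letters such as ø, æ, ł and ß have no canonical decomposition, so they survive normalization unchanged and would still need to be listed explicitly.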

You could always use a blacklist instead of a whitelist. That way you only remove the characters you do not need.

You could use a blacklist – a list of characters to exclude.

Also, it is important to verify the input on server-side, not only on client-side! Client-side can be bypassed easily.

There are some shortcuts to achieve this in other regular expression dialects – see this page. But I don’t believe there are any standardised ones in JavaScript – certainly not any that would be supported by all browsers.

function noExtendedChars( input_name ){

    // Each row: the base letter first, then the extended characters replaced by it.
    var blacklist = [
        ['a',  'à','á','â','ä','æ','ã','å','ā'],
        ['c',  'ç', 'ć', 'č'],
        ['e',  'è','é','ê','ë','ē','ė','ę'],
        ['i',  'ï','ï','í','ī','į','î'],
        ['l',  'ł'],
        ['n',  'ñ', 'ń'],
        ['o',  'ô', 'ö', 'ò', 'ó', 'œ', 'ø', 'ō', 'õ' ],
        ['s',  'ß', 'ś', 'š' ],
        ['u',  'û', 'ü', 'ù', 'ú', 'ū'],
        ['y',  'ÿ'],
        ['z',  'ž', 'ź', 'ż']
    ];

    for( var b=0; b < blacklist.length; b++ ){
        var r = blacklist[b];
        for ( var a=1; a < r.length; a++ ){
            // "gi" also matches uppercase variants; they become the lowercase base letter.
            input_name = input_name.replace( new RegExp( r[a], "gi") , r[0]);
        }
    }
    return input_name;
}

var regexp = /\B#[a-zA-Z\x7f-\xff]+/g;
var result = searchText.match(regexp);
