Correctly applying case to Mac, Mc and Hyphenated names

Post by antonio » Thu Apr 20, 2006 4:18 am

As a developer, I like to make things as easy and consistent as possible for data entry, so I make good use of
Code:
location = location.toUpperCase();
family_name = utils.stringInitCap(family_name);
first_name = utils.stringInitCap(first_name);



I like to make my apps clever enough to sort out the correct case of things, e.g. if the user slips and enters ANtonio.

Many family names of British and Irish origin are handled incorrectly by InitCap (MacDowell, McKay, O'Connor, etc.). Nor does it handle hyphenated names, which are common in many cultures.
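To illustrate the problem, here is a minimal sketch in plain JavaScript. `naiveInitCap` is a hypothetical stand-in that capitalises only the first letter after whitespace. It is not Servoy's `utils.stringInitCap`, whose exact treatment of apostrophes and hyphens is an assumption here.

```javascript
// Hypothetical stand-in for an InitCap-style function (NOT Servoy's implementation):
// lower-case everything, then upper-case the first letter of each whitespace-separated word.
function naiveInitCap(text) {
  return text
    .toLowerCase()
    .replace(/(^|\s)([a-z])/g, function (match, lead, letter) {
      return lead + letter.toUpperCase();
    });
}

var samples = ["MACDOWELL", "mckay", "O'CONNOR", "smith-jones"];
samples.forEach(function (name) {
  // The inner capitals (MacDowell, McKay, O'Connor, Smith-Jones) are lost.
  console.log(name + " -> " + naiveInitCap(name));
});
```

With this simple rule the four samples come out as Macdowell, Mckay, O'connor and Smith-jones, which is the kind of result the method below is meant to correct.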

Here's a method I call ProperCase, attached to the onDataChange property of the family_name field.

Code:
// Small method to correctly case names of Scottish and Irish origin, and hyphenated names.
// Tested against 65000 common names from census data.
// Only leading MC and MAC are replaced.

// add spaces so InitCap works
// need a space at the front for the MC step - doesn't affect SIMCOE, TOMCZAK, RAMCHARAN etc
var temp_name = " " + family_name.toUpperCase();
temp_name = utils.stringReplace(temp_name, "O'", "O' ");
temp_name = utils.stringReplace(temp_name, "-", "- ");
temp_name = utils.stringReplace(temp_name, " MC", " MC ");

// need to skip short names beginning with MAC, eg MACE, MACK, MACY, MAKIN, MACON, MACRI, MACEY etc
// ignore names with <= 5 chars as there are no (common) names of the form MacXx
// from census data and phone listings, almost all Mac* names with > 5 chars are MacXx*
// we need a lookup of Mac names with > 5 chars that don't fit the rule.
// this list could instead be in a related table, user-editable, to allow for locally common exceptions.
// could also change the code to optionally allow the formatting to be overridden.
var macExceptions = "MACABAGAL|MACADANGDANG|MACAK|MACARIO|MACARO|MACAW|MACCA|MACCAR|MACCARONE|MACCORA|MACHRI|MACHOSS|MACHOY|MACHUCA|MACIAK|MACIEL|MACISZEWSKI|MACIULATIS|MACKOJC|MACLIDES|MACUCUK|MACUT";
// all cases of MACCH* stay as Macch*; these are handled in the loop below

var words = utils.stringWordCount(temp_name);
var temp_word = "";
for ( var i = 1 ; i <= words ; i++ )
{
   temp_word = utils.stringMiddleWords(temp_name, i, 1);
   if (utils.stringLeft(temp_word, 3) == "MAC" && temp_word.length > 5 && utils.stringLeft(temp_word, 5) != "MACCH" && utils.stringPatternCount(macExceptions, temp_word) < 1)
   {
      temp_name = utils.stringReplace(temp_name, temp_word, "MAC " + utils.stringRight(temp_word, temp_word.length - 3));
      i++;
      words++;
   }
}
// apply InitCap
temp_name = " " + utils.stringInitCap(temp_name);
// strip added spaces
temp_name = utils.stringReplace(temp_name, "Mac ", "Mac");
temp_name = utils.stringReplace(temp_name, " Mc ", " Mc");
temp_name = utils.stringReplace(temp_name, "- ", "-");
temp_name = utils.stringReplace(temp_name, "O' ", "O'");

family_name = utils.stringTrim(temp_name);
return 1;



Using this, MAcdowell-o'rielly --> MacDowell-O'Rielly

Another possible enhancement: via a setup global, users could be given the option of applying UPPER or Proper case to selected fields.
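A rough sketch of how such an option might look, in plain JavaScript. The names `applyCaseOption` and `caseMode`, and the values "UPPER" and "PROPER", are hypothetical. In Servoy the mode would presumably come from a setup global, and `properCaseFn` would be whatever proper-casing routine is in use.

```javascript
// caseMode is a hypothetical setting ("UPPER" or "PROPER"), e.g. read from a setup global.
// properCaseFn is the proper-casing routine to apply when the mode is "PROPER".
function applyCaseOption(value, caseMode, properCaseFn) {
  if (caseMode === "UPPER") {
    return value.toUpperCase();
  }
  if (caseMode === "PROPER") {
    return properCaseFn(value);
  }
  // Unknown mode: leave the value exactly as the user typed it.
  return value;
}

// Usage with a trivial stand-in for the proper-casing routine.
var firstLetterOnly = function (text) {
  return text.charAt(0).toUpperCase() + text.slice(1).toLowerCase();
};
console.log(applyCaseOption("sydney", "UPPER", firstLetterOnly));
console.log(applyCaseOption("sydney", "PROPER", firstLetterOnly));
```

Passing the routine in as a parameter keeps the option independent of any particular casing rule.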
Last edited by antonio on Sat Feb 03, 2007 11:20 pm, edited 2 times in total.
Tony
Servoy 8 - 2022.03 LTS
antonio
 
Posts: 638
Joined: Sun Apr 02, 2006 2:14 am
Location: Australia

Post by ngervasi » Thu Apr 20, 2006 11:09 am

Thanks for sharing, Antonio!
Nicola Gervasi
sintpro.com
SAN Partner
ngervasi
 
Posts: 1485
Joined: Tue Dec 21, 2004 12:47 pm
Location: Arezzo, Italy

Post by Harjo » Thu Apr 20, 2006 11:17 am

Great tip!
Harjo Kompagnie
ServoyCamp
Servoy Certified Developer
Servoy Valued Professional
SAN Developer
Harjo
 
Posts: 4321
Joined: Fri Apr 25, 2003 11:42 pm
Location: DEN HAM OV, The Netherlands

Post by antonio » Thu Apr 20, 2006 11:23 am

Thanks, I'd be interested to hear from anyone who can see other cases that could be included with a rule, such as d'Souza or van der Hayden.
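One possible rule for those two cases, sketched in plain JavaScript. The particle list and the decision to keep particles in lower case are assumptions for illustration only. Conventions differ between countries and families, so a user-editable list would likely be needed in practice. `LOWER_PARTICLES`, `capitaliseWord` and `caseWithParticles` are hypothetical names.

```javascript
// Hypothetical list of particles to keep in lower case; adjust for local conventions.
var LOWER_PARTICLES = ["van", "der", "den", "de", "von"];

function capitaliseWord(word) {
  return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase();
}

function caseWithParticles(name) {
  return name
    .trim()
    .split(/\s+/)
    .map(function (word) {
      var lower = word.toLowerCase();
      // Known particle: keep it entirely in lower case.
      if (LOWER_PARTICLES.indexOf(lower) !== -1) {
        return lower;
      }
      // Leading d' stays lower case; the rest is capitalised (d'souza -> d'Souza).
      if (/^d'.+/.test(lower)) {
        return "d'" + capitaliseWord(lower.slice(2));
      }
      return capitaliseWord(lower);
    })
    .join(" ");
}

console.log(caseWithParticles("VAN DER HAYDEN"));
console.log(caseWithParticles("D'SOUZA"));
```

This sketch does not cover Mac, Mc, O' or hyphenated names; it would have to be combined with the rules in the method above.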

Post by swingman » Thu Apr 20, 2006 3:41 pm

Just one last thought.

I wonder whether anyone has written a regex to do this kind of thing. Regex is very powerful, but cryptic. You may be able to replace the whole thing with one line of code ;-) Try Google.
Christian Batchelor
Certified Servoy Developer
Batchelor Associates Ltd, London, UK
http://www.batchelorassociates.co.uk

http://www.postgresql.org - The world's most advanced open source database.
swingman
 
Posts: 1472
Joined: Wed Oct 01, 2003 10:20 am
Location: London

Post by antonio » Sat Feb 03, 2007 11:04 pm

I never did find a Regex that would do the trick, but this one might be close.

http://livetrix.wiki.ub.rug.nl/index.php/Features/Author_name_normalization

The catch is that some names, like Macey, don't change case, which a method can be designed to handle.

If you see a way to do this with Regex more efficiently, I'd love to learn more.
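For comparison, here is a sketch of a regex-based version in plain JavaScript. It is not a drop-in replacement for the Servoy method earlier in the thread: it uses no `utils` functions, matches exceptions exactly rather than by substring, and has not been run against the census data. The exception list is copied from that method; `casePart`, `properCase` and `capitalise` are hypothetical names.

```javascript
// Exception list copied from the method earlier in the thread.
var MAC_EXCEPTIONS = ("MACABAGAL|MACADANGDANG|MACAK|MACARIO|MACARO|MACAW|MACCA|" +
  "MACCAR|MACCARONE|MACCORA|MACHRI|MACHOSS|MACHOY|MACHUCA|MACIAK|MACIEL|" +
  "MACISZEWSKI|MACIULATIS|MACKOJC|MACLIDES|MACUCUK|MACUT").split("|");

function capitalise(word) {
  return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase();
}

// Case a single name part (the text between spaces or hyphens).
function casePart(part) {
  var upper = part.toUpperCase();
  var match;
  // O'CONNOR -> O'Connor (the remainder is cased by the same rules).
  if ((match = /^O'(.+)$/.exec(upper))) {
    return "O'" + casePart(match[1]);
  }
  // MCKAY -> McKay
  if ((match = /^MC(.+)$/.exec(upper))) {
    return "Mc" + capitalise(match[1]);
  }
  // MACDOWELL -> MacDowell, only for parts longer than 5 characters that
  // do not start with MACCH and are not in the exception list.
  if (/^MAC(?!CH).{3,}$/.test(upper) && MAC_EXCEPTIONS.indexOf(upper) === -1) {
    return "Mac" + capitalise(upper.slice(3));
  }
  return capitalise(upper);
}

function properCase(name) {
  // Split on spaces and hyphens; the capturing group keeps the separators.
  return name
    .trim()
    .split(/([\s-]+)/)
    .map(function (piece) {
      return /^[\s-]+$/.test(piece) ? piece : casePart(piece);
    })
    .join("");
}

console.log(properCase("MAcdowell-o'rielly"));
console.log(properCase("MACEY"));
```

The regex only decides whether a part looks like a Mac name; names such as Macey and the listed exceptions still need the length check and the lookup, so a single one-line pattern is unlikely to be enough.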

