Objective-C - Find and replace long words in an NSString?


I'm trying to write a method that will search an NSString, determine whether any individual word within the string is over 6 characters long, and replace that word with some other word (something arbitrary like 'hello').

I'm starting with a long paragraph, and I need to end up with a single NSString object whose format and spacing have not been affected by the find and replace.

Why another answer?

The simple solutions built on componentsSeparatedByString: have a couple of subtle problems:

  1. Punctuation is not handled as a word delimiter.
  2. Whitespace other than the space character (newline, tab) is dropped.
  3. On long strings, a lot of memory is wasted.
  4. It's slow.
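The first two problems can be reproduced with a minimal sketch of the naive approach. The helper name NaiveReplaceLongWords and its parameters are hypothetical, chosen only for illustration; the sketch assumes the text is split on single spaces and rejoined afterwards.

```objective-c
#import <Foundation/Foundation.h>

// Naive approach: split on single spaces, replace long pieces, rejoin.
// NaiveReplaceLongWords is a hypothetical helper, not part of Foundation.
static NSString *NaiveReplaceLongWords(NSString *text,
                                       NSUInteger maxLength,
                                       NSString *replacement) {
    // Allocates one NSString per piece, plus the array that holds them.
    NSArray<NSString *> *pieces = [text componentsSeparatedByString:@" "];
    NSMutableArray<NSString *> *output =
        [NSMutableArray arrayWithCapacity:pieces.count];
    for (NSString *piece in pieces) {
        // Punctuation, newlines and tabs attached to a word count toward
        // its length and disappear when the piece is replaced.
        [output addObject:(piece.length > maxLength ? replacement : piece)];
    }
    return [output componentsJoinedByString:@" "];
}
```

Called as NaiveReplaceLongWords(@"he concluded, finally", 6, @"-"), this returns "he - -": the comma is swallowed together with the word it was attached to. With @"a longer\nwords b" the two short words are treated as one long piece and the newline is lost.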

Example

Assuming a substitution word of "–", the string ...

“Essentially,” the D.H.C. concluded,
”bokanovskification consists of a series of arrests of development.”

... would result in ...

– the D.H.C. – – of a series of – of –

... while the correct output would be:

“–,” the D.H.C. –,
”– – of a series of – of –.”

Solution

Fortunately, there's a better yet simple solution in Cocoa: -[NSString enumerateSubstringsInRange:options:usingBlock:]

It provides fast iteration over the substrings defined by the options argument. One possibility is NSStringEnumerationByWords, which enumerates all substrings that are real words (in the current locale). It even detects individual words in languages that don't use delimiters (spaces) to separate words, such as Japanese.
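As a rough illustration of what the word enumeration reports, the following sketch collects every word it finds. WordsInString is a hypothetical helper name; only the Foundation calls are real API.

```objective-c
#import <Foundation/Foundation.h>

// Collects the substrings that word enumeration reports.
// WordsInString is a hypothetical helper used for illustration only.
static NSArray<NSString *> *WordsInString(NSString *text) {
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByWords
                          usingBlock:^(NSString *substring,
                                       NSRange substringRange,
                                       NSRange enclosingRange,
                                       BOOL *stop) {
        // substringRange covers the word itself; enclosingRange also
        // includes the surrounding punctuation and whitespace.
        [words addObject:substring];
    }];
    return words;
}
```

For @"Hello, world!\nBye." this should yield the three words Hello, world and Bye, with the comma, exclamation mark, newline and period left out of the reported substrings.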

Comparing the solutions

Here's a simple demo project that works on the Jargon File (1.6 MB, 237,239 words). It compares three different solutions:

  1. componentsSeparatedByString: 270 ms
  2. enumerateSubstringsInRange: 125 ms
  3. stringByReplacingOccurrencesOfString, as described by @monolo: 200 ms

Implementation

The core of the replacement loop:

    NSMutableString *result = [NSMutableString stringWithCapacity:[originalString length]];
    // End of the last replaced word; everything before it is already in result.
    __block NSUInteger location = 0;
    [originalString enumerateSubstringsInRange:(NSRange){0, [originalString length]}
                                       options:NSStringEnumerationByWords | NSStringEnumerationLocalized | NSStringEnumerationSubstringNotRequired
                                    usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        if (substringRange.length > maxChar) {
            // Copy the untouched text between the previous long word and this one.
            NSString *charactersBetweenLongWords = [originalString substringWithRange:(NSRange){ location, substringRange.location - location }];
            [result appendString:charactersBetweenLongWords];
            [result appendString:replaceWord];
            location = substringRange.location + substringRange.length;
        }
    }];
    // Append whatever follows the last long word.
    [result appendString:[originalString substringFromIndex:location]];

Caveat

As pointed out by Monolo, the proposed code uses NSString's length to determine the number of characters in a word. That's a questionable approach, to say the least. In fact, a string's length specifies the number of UTF-16 code units used to encode the string, a value that can differ from what a human would assume to be the number of characters.

As the term "character" has different meanings in various contexts and the OP didn't specify which kind of character count to use, I left the code as it was. If you want a different count, please refer to the documentation that discusses the topic.
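One possible alternative count, sketched here as an assumption rather than as what the answer intended, is the number of composed character sequences, which is closer to what a reader perceives as characters. ComposedCharacterCount is a hypothetical helper name.

```objective-c
#import <Foundation/Foundation.h>

// Counts composed character sequences instead of UTF-16 code units.
// ComposedCharacterCount is a hypothetical helper used for illustration.
static NSUInteger ComposedCharacterCount(NSString *text) {
    __block NSUInteger count = 0;
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByComposedCharacterSequences |
                                     NSStringEnumerationSubstringNotRequired
                          usingBlock:^(NSString *substring,
                                       NSRange substringRange,
                                       NSRange enclosingRange,
                                       BOOL *stop) {
        // One callback per composed sequence, e.g. a base letter plus its
        // combining accents, or a surrogate pair.
        count++;
    }];
    return count;
}
```

For the string @"e\u0301" (the letter e followed by a combining acute accent), length is 2 while ComposedCharacterCount returns 1. In the replacement loop, such a count could be applied to the substring at substringRange in place of substringRange.length.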

