RMRK RPG Maker Creation => RPG Maker General => General Scripting => Topic started by: TheoAllen on May 17, 2013, 01:16:16 AM

Title: About string split method
Post by: TheoAllen on May 17, 2013, 01:16:16 AM

Back when I started learning how to read notetags, I used Yanfly's way of splitting the notetag string into lines with this call:

Code: [Select]

self.note.split(/[\r\n]+/).each do |line|

I didn't know how it worked; I just followed along.

Now I've realized that if I want to split a string into lines, it can also be done by writing this:

Code: [Select]

self.note.split(/\n/).each do |line|

The question is, what's the difference between those two commands?

Title: Re: About string split method
Post by: LoganF on May 17, 2013, 02:55:31 PM

The visible difference pretty much sums it up.

In Yanfly's version, the string is split at any run of line feeds (\n) and carriage returns (\r). Your version only takes line feeds into account, not carriage returns. The \r\n pair is the line ending used on Windows, which comes from the DOS days, I believe.
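For illustration, a minimal sketch of that difference in plain Ruby. The note text and the helper names split_lf and split_crlf_runs are made up for this example and are not part of Yanfly's scripts.

```ruby
# Split on line feeds only; a carriage return stays attached to its line.
def split_lf(text)
  text.split(/\n/)
end

# Split on runs of carriage returns and/or line feeds.
def split_crlf_runs(text)
  text.split(/[\r\n]+/)
end

# Hypothetical note text with Windows-style (\r\n) line endings.
note = "<atk: 5>\r\n<def: 3>\r\n"

p split_lf(note)         # => ["<atk: 5>\r", "<def: 3>\r"]
p split_crlf_runs(note)  # => ["<atk: 5>", "<def: 3>"]
```

With the line-feed-only pattern, each line keeps a trailing \r, which can trip up a later regular expression that expects the line to end right after the tag.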

I would stick to the traditional version that Yanfly uses.

There are probably other, more subtle differences that others, like modern, may be able to comment on. I'm not enough of an expert on regular expressions to offer much insight (and I don't have the time at the moment to research it).

Title: Re: About string split method
Post by: Zeriab on May 20, 2013, 07:42:30 PM

The string.split (http://www.ruby-doc.org/core-1.9.1/String.html#method-i-split) method partitions a string into elements according to the specified separator.
Consider a comma-separated values (CSV) file, which you can open with Excel, for example; it splits the elements of a line neatly into different columns. It does not take the actual commas along. This is analogous to how the string split method works: the separator is not added to the elements in the array. Now, instead of a specific character, we define the separator by a regular expression, where Yanfly uses [\r\n]+
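A small sketch of the CSV analogy, assuming a made-up row of data; split_row is a hypothetical helper name used only here.

```ruby
# Split one comma-separated row into its fields.
# The commas themselves do not appear in the result.
def split_row(row)
  row.split(",")
end

fields = split_row("sword,12,iron")
p fields            # => ["sword", "12", "iron"]
p fields.join(",")  # => "sword,12,iron"
```

Joining the fields with the same separator rebuilds the original row, which shows that split only removed the commas and nothing else.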

The first step to understanding the difference between the two lines is to understand what the regular expression matches. A good heuristic is to decompose it into the simplest entities and build up.
We have [ ], meaning "any one character in the set". If we for example have /[ab]/ then it is equivalent to /a|b/.
There are two possible matches: "a" and "b".

The + means 1 or more repetitions. /a+/ matches "a", "aa", "aaa", ...
Since the [] is treated as a single entity for the +, we have that /[ab]+/ matches, for example, "abbaba", "baab" and "a".
Substitute a and b with \n and \r to get Yanfly's expression.
We can now easily see that a separator is defined as any non-empty run consisting only of line feed and carriage return characters.
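To see what the two patterns actually match, here is a rough sketch using String#scan, which returns every match it finds. The helper names and the sample strings are made up for illustration.

```ruby
# Every run of "a" and "b" characters found in the text.
def runs_of_ab(text)
  text.scan(/[ab]+/)
end

# Every separator that Yanfly's pattern would find in the text.
def separators(text)
  text.scan(/[\r\n]+/)
end

p runs_of_ab("abbaba xx baab a")      # => ["abbaba", "baab", "a"]
p separators("a\nb\r\nc\n\r\n\nd")    # => ["\n", "\r\n", "\n\r\n\n"]
```

Note how the last separator is one match four characters long: the + keeps going for as long as it sees \r or \n.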

Compare with your line where a separator is a single line feed.
Including the carriage return is considered good practice because you would then cover both the Windows and Unix cases.
The important part, however, is that your line treats a separator as exactly one line feed character.
This matters because of the way the string.split method treats separators.

Try to figure out on your own how it works. When you feel ready, try to guess what the following code will output before running it:

Code: [Select]

str = "Test 1\nTest 2\r\n\n\n\r\n5\n\r642345\n2gj"

arr1 = []
arr2 = []

str.split(/[\r\n]+/).each do |line|
  arr1 << line
end

str.split(/\n/).each do |line|
  arr2 << line
end

str1 = arr1.join
str2 = arr2.join

p str, str1, str2, arr1, arr2
#msgbox_p str, str1, str2, arr1, arr2 # Use if run in VX Ace
exit

Remember that one is not necessarily better than the other. Not in general. Which to pick depends on your specific context.
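As one illustration of how context decides, here is a sketch with a made-up three-line note whose middle line is blank. The helper names are hypothetical, and the /\r?\n/ pattern is an assumption added for this example rather than something taken from either script above: it accepts both \n and \r\n but treats each line break as exactly one separator.

```ruby
# Collapses runs of line breaks, so blank lines disappear.
def lines_collapsed(text)
  text.split(/[\r\n]+/)
end

# Treats every single line break (\n or \r\n) as one separator,
# so blank lines in the middle of the text survive as empty strings.
def lines_preserved(text)
  text.split(/\r?\n/)
end

# Made-up note: three lines, the middle one blank, Windows line endings.
note = "first\r\n\r\nthird"

p lines_collapsed(note)  # => ["first", "third"]
p lines_preserved(note)  # => ["first", "", "third"]
```

The first form suits notetag parsing where blank lines carry no meaning. The second suits cases where the position of a line matters, for example when the third line must stay the third element.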

*hugs*
- Zeriab

Title: Re: About string split method
Post by: LoganF on May 21, 2013, 09:10:39 PM

Thanks for the input Zeriab. I always enjoy reading.

You have this very strange mystical presence when you descend to share your wisdom and knowledge. I feel bad for not remembering to include you as one of those who will know more about the subject.

Title: Re: About string split method
Post by: TheoAllen on May 22, 2013, 11:12:02 PM

Sorry for the late reply; I almost forgot about this thread. Btw, that explains a lot. I like to study through examples, experiments, and their outputs, and then draw my own conclusions and explanations. Now I understand a little bit more about regular expressions. Thanks very much...

*solved*
