Friday, April 2, 2010

GREP query

Hi,

I am sure this is simple but I cannot for the life of me work it out...

How to select the last tab (>>) character and the text that follows, but not the paragraph marker (¶).

Text1 >> Text2 >> Text3¶

I am sure I need to use the "shortest match" code in the Repeat section, but I just can't get it right.

Any help would be greatly appreciated.

Thanks

Hi

You can try this :

\t\w+(?=$)

L. Tournier

I'm not sure about the GREP expression, because I'm not sure what your text consists of, but I sometimes use the following strange technique for text with a fixed number of tabs on each line:

1. Convert text to table.

2. Insert a new column just to the left of last column.

3. Fill every cell in that new column with a special character (you can do this by copying and pasting a long repetitive string of special characters and paragraph breaks into the newly-inserted column of the original table).

4. Convert the table back into text. Now you have another tab and the special character in front of the last tab in every paragraph of the original text.

5. Now use easy GREP to remove the new tab, and use the special character to isolate the text between it and the end of the paragraph.

It's kind of crazy, but it works for me!
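
For readers who want to see what the table trick does to the text, here is a minimal Python sketch of the same idea applied to a plain string. It is not InDesign code: the marker character (§) and the function names `mark_last_column` and `last_field` are assumptions made up for this illustration.

```python
MARKER = "\u00a7"  # "§", a hypothetical marker assumed not to occur in the text


def mark_last_column(paragraph: str, marker: str = MARKER) -> str:
    """Mimic steps 1-4: split on tabs, add a marker column before the last one."""
    cells = paragraph.split("\t")          # "convert text to table"
    if len(cells) < 2:                     # no tab, nothing to mark
        return paragraph
    cells.insert(len(cells) - 1, marker)   # new column left of the last column
    return "\t".join(cells)                # "convert table back to text"


def last_field(marked: str, marker: str = MARKER) -> str:
    """Mimic step 5: use the marker to isolate the text after the last tab."""
    _, _, tail = marked.partition(marker + "\t")
    return tail


marked = mark_last_column("Text1\tText2\tText3")
print(repr(marked))              # 'Text1\tText2\t§\tText3'
print(repr(last_field(marked)))  # 'Text3'
```

Like the table technique itself, this only helps when the marker never appears in the original text.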

Thanks L. Tournier,

This is closer than I have got. Unfortunately it doesn't quite do the job. The text after the last tab could be any characters, including spaces, punctuation and escapable characters.

I get the \t\w+ bit (a tab character followed by one or more word characters), but what does (?=$) do?

Thanks

Simon Kemp

How about this

\t\w+$

Perhaps with this :

\t[^\t]+(?=\r)

(?=$) is a positive lookahead on the end-of-paragraph location ($).

[^\t]+ means any character except a tab, one or more times.

I think (?=\r) is better than (?=$) in your case.

It is difficult for me to explain in English. With the positive lookahead, you match the characters only if they are followed by the paragraph mark, but you don't select the paragraph mark itself.

L. Tournier
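
The effect of the lookahead described above can be tried outside InDesign. The following is a small sketch using Python's `re` module, which is a different regex engine from InDesign's GREP but handles `(?=...)` and `[^\t]` the same way for this case. The sample paragraph is an assumption modelled on the question, with `\r` standing in for the paragraph mark.

```python
import re

# One paragraph: tab-separated fields ending in a paragraph mark (\r).
paragraph = "Text1\tText2\tText3\r"

# Without a lookahead, the paragraph mark becomes part of the match.
with_mark = re.search(r"\t[^\t]+\r", paragraph).group()

# With (?=\r), the mark must follow the match but is not consumed by it.
without_mark = re.search(r"\t[^\t]+(?=\r)", paragraph).group()

print(repr(with_mark))     # '\tText3\r'
print(repr(without_mark))  # '\tText3'
```

In both cases the first tab is skipped, because the text after it runs into another tab before it reaches the paragraph mark.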

Good one, but simpler as \t\w+$

Thanks, works perfectly.

Your explanation also makes sense and I have learnt a little bit more.

Simon Kemp.

Thanks Eugene,

I also like to keep things as simple as possible. Your suggestion was one of the expressions I tried myself. Unfortunately it does not find the text if it contains non-word characters such as hyphens, punctuation and slashes (like in file paths).

Simon Kemp.
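
The limitation Simon describes is easy to reproduce. This is a rough Python sketch, not InDesign GREP: the sample strings and the helper name `find_last_field` are invented for the example, and the paragraphs carry no trailing paragraph mark so that `$` simply means "end of the string".

```python
import re

WORD_ONLY = re.compile(r"\t\w+$")     # tab + word characters up to the end
ANY_BUT_TAB = re.compile(r"\t[^\t]+$")  # tab + anything except a tab up to the end


def find_last_field(pattern, paragraph):
    """Return the matched text, or None when the pattern finds nothing."""
    match = pattern.search(paragraph)
    return match.group() if match else None


plain = "Text1\tText2\tText3"
path = "Text1\tText2\tC:/docs/my-file.txt"

print(repr(find_last_field(WORD_ONLY, plain)))    # '\tText3'
print(repr(find_last_field(WORD_ONLY, path)))     # None
print(repr(find_last_field(ANY_BUT_TAB, path)))   # '\tC:/docs/my-file.txt'
```

The colon, slashes, hyphen and dot are not word characters, so `\w+` cannot reach the end of the path, while `[^\t]+` accepts them all.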

Hmm... your example didn't include that.

This might be an alternative:

\t\w+([[:punct:]]+)$

Thanks "pour les compliments" (for the compliments).

The only shorthand codes you can use to match "anything except something" are \W, \L, \U, \D and \S.

As for the negated character class [^...], I probably learned it from Peter Kahrel's PDF, GREP in InDesign CS3 (O'Reilly). Jeffrey E. F. Friedl also discusses them in Mastering Regular Expressions.

L. Tournier
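
To illustrate the difference between the negated shorthands and a negated class, here is a short Python sketch. Python's `re` module is assumed here only as a stand-in: it supports \W, \D and \S but has no \L or \U, and the sample strings are made up.

```python
import re

field = "my-file v2.txt"

# \W matches any single character that is not a word character,
# which is the same set as the negated class [^\w].
non_word = re.findall(r"\W", field)
non_word_class = re.findall(r"[^\w]", field)
print(non_word)        # ['-', ' ', '.']
print(non_word_class)  # ['-', ' ', '.']

# \S ("not whitespace") cannot cross the space in the last field,
# but a negated class lets you exclude exactly one thing: the tab.
paragraph = "Text1\tmy file"
print(re.search(r"\t\S+$", paragraph))               # None
print(repr(re.search(r"\t[^\t]+$", paragraph).group()))  # '\tmy file'
```

The shorthands cover a few fixed cases, whereas `[^...]` can exclude any character you choose.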
