search - How to find special characters in Linux Vim

2014-07
  • Marcus Thornton

    I want to find special characters in a text file. The UTF-8 encoded file is known to contain:

    Chinese characters,
    "-",
    "^A" (Control-A, which is one of the special characters),
    numbers,
    letters, and
    some other characters. <- These are what I want to find.
    

    I'm using Vim on Linux to find those other special characters.

    I used

    /[^^A0-9a-zA-Z-] 
    

    to find them, but this also matches Chinese characters. How do I filter out the Chinese characters and show only the other special characters in the file?

  • Answers
  • Ingo Karkat

    The Unicode code point range for CJK Unified Ideographs is U+4E00-U+9FFF; you'd have to exclude that range of characters from your [...] collection (probably using the \uNNNN escape, which is the in-collection counterpart of the \%uNNNN regular expression atom).

    Unfortunately, Vim currently cannot search for ranges larger than 256 characters, so you'd have to combine multiple collections ([...]\|[...]\|[...]\|...), or choose a different approach.
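
    As a rough sketch of the "combine multiple collections" idea, the sub-ranges can be generated in Vim script instead of typed by hand. The helper name CjkChunks is made up for illustration, and the pattern has not been checked against every Vim version. It splits U+4E00-U+9FFF into 82 blocks of 256 code points each, rejects them with a negative look-ahead (\@!), and then applies the original collection, with Control-A written as \x01:

    ```vim
    " Hypothetical helper: returns [\u4e00-\u4eff]\|[\u4f00-\u4fff]\|...
    " Each collection spans only 256 code points.
    function! CjkChunks() abort
      let chunks = []
      for hi in range(0x4E, 0x9F)
        call add(chunks, printf('[\u%02x00-\u%02xff]', hi, hi))
      endfor
      return join(chunks, '\|')
    endfunction

    " Match a character that is NOT a CJK ideograph and NOT one of the
    " expected characters (digits, ASCII letters, Control-A, "-").
    let @/ = '\%(' . CjkChunks() . '\)\@!' . '[^0-9a-zA-Z\x01-]'
    ```

    After sourcing this, pressing n should jump to the next remaining special character. Like the original pattern, it also matches spaces and any Chinese punctuation that lies outside the ideograph range.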


  • Related Question

    macros - Is there any way to have Vim not count special characters as words?
  • leeand00

    I'm using Vim to do a lot of work for me with macros.

    There's a lot of text in columns, and I want the macro to move between columns effortlessly by pressing the w key to "move to the beginning of the next word".

    For example:

    DataSourceName            string                       ""   
    DetailFields              []string                          
    DynamicControlBorder      boolean                 empty  may be void 
    EscapeProcessing          boolean                    True   
    FetchDirection            long                       1000   
    FetchSize                 long                         12   
    Filter                    string                       ""   
    GroupBy                   string                       ""   
    HavingClause              string                       ""
    

    However, when I do this, Vim only treats letters this way; whenever it encounters a "[" or a " it interprets it as another word, which breaks the macro because there now appears to be an additional column.

    Is there any setting I can change to make Vim stop treating the special characters as separate words and skip over them just like letters?


  • Related Answers
  • mas

    You can change the definition of a word in vim by using

    :set iskeyword=<specification>
    

    Remember to change it back, though, when you have finished with the special usage.

    :set iskeyword?
    

    will show the current usage. My terminal responds with

    iskeyword=@,48-57,_,192-255

    which stands for all the letters a-z and A-Z (@), the digits 0 to 9 (character codes 48-57), the underscore, and international letters (character codes 192-255, which lie outside ASCII).
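
    As a sketch for the column example in the question, assuming the only troublesome characters are "[", "]" and the double quote: they can be added to iskeyword by character code for the current buffer, and the previous value restored afterwards. The variable name b:saved_isk is made up for illustration.

    ```vim
    " Save the buffer-local value so it can be restored later.
    let b:saved_isk = &l:iskeyword

    " Treat [ (91), ] (93) and " (34) as keyword characters.
    setlocal iskeyword+=91,93,34

    " ... run the macro that relies on w here ...

    " Restore the original definition of a word.
    let &l:iskeyword = b:saved_isk
    ```

    With the extra characters in place, w should move over []string and "" as single words. Using character codes sidesteps having to escape the double quote in a :set command.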