vim - How to go to the n'th character, not byte, of a file?

2014-07
  • dotancohen

    In vim one can get to the 5th byte of the file with the following command:

    :goto 5
    

    However, in a UTF-8 text this could be the 5th, 4th, or even 2nd character in the file. How can I go to the 5th character, rather than the 5th byte, of a file?
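
    To illustrate the mismatch, here is a small sketch that can be sourced in Vim, assuming 'encoding' is utf-8. The sample strings are made up for the example. byteidx() returns the 0-based byte index at which a given 0-based character starts, so a result of 4 means "starts at byte 5".

```vim
scriptencoding utf-8
" Pure ASCII: the 5th character starts at byte 5 (byte index 4).
echo byteidx('abcde', 4)
" 'ż' takes 2 bytes, so byte 5 is where the 4th character starts.
echo byteidx('żabc', 3)
" The emoji takes 4 bytes, so byte 5 is where the 2nd character starts.
echo byteidx('😊a', 1)
```

    All three calls echo 4.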

  • Answers
  • Ingo Karkat

    You can start from the beginning of the buffer and use search() to match N characters. The one pitfall is that newline characters have to be counted, too. Here's a custom gco mapping that does this:

    function! s:GoToCharacter( count )
        let l:save_view = winsaveview()
        " We need to include the newline position in the searches, too. The
        " newline is a character, too, and should be counted.
        let l:save_virtualedit = &virtualedit
        try
            let [l:fixPointMotion, l:searchExpr, l:searchFlags] = ['gg0', '\%#\_.\{' . (a:count + 1) . '}', 'ceW']
            silent! execute 'normal!' l:fixPointMotion
    
            if search(l:searchExpr, l:searchFlags) == 0
                " We couldn't reach the final destination.
                execute "normal! \<C-\>\<C-n>\<Esc>" | " Beep.
                call winrestview(l:save_view)
                return 0
            else
                return 1
            endif
        finally
            let &virtualedit = l:save_virtualedit
        endtry
    endfunction
    " We start at the beginning, on character number 1.
    nnoremap <silent> gco :<C-u>if ! <SID>GoToCharacter(v:count1 - 1)<Bar>echoerr 'No such position'<Bar>endif<Bar><CR>
    

    Note that this counts just one character for the CR-LF combination in buffers that have 'fileformat' set to dos.
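
    A different approach, sketched here and not part of the answer above, is to translate the character number into a byte number with byteidx() and hand that to :goto. The function name GotoCharSketch is hypothetical. The sketch assumes 'encoding' is utf-8, 'fileformat' is unix (so every line break is exactly one byte for :goto), and a Vim built with +byte_offset. It joins the whole buffer into one string, so it may be slow on very large files, and byteidx() treats composing characters as part of the preceding base character.

```vim
" Hypothetical helper: jump to the n'th character of the buffer (1-based),
" counting each line break as one character.
" Returns 1 on success, 0 if the buffer has fewer than n characters.
function! GotoCharSketch(n) abort
    " Join the lines with "\n" so that byte offsets in the string match
    " the byte numbers :goto uses when 'fileformat' is unix.
    let l:text = join(getline(1, '$'), "\n")
    " 0-based byte index of the 0-based character (n - 1); -1 if out of range.
    let l:byte = byteidx(l:text, a:n - 1)
    if l:byte < 0 || l:byte >= strlen(l:text)
        return 0
    endif
    " :goto counts bytes starting from 1.
    execute 'goto' (l:byte + 1)
    return 1
endfunction
```

    With this defined, :call GotoCharSketch(5) should put the cursor on the 5th character of the file.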

  • Ben

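    What about just gg0 to go to the first character in the file, then 4l to move four characters to the right, landing on the 5th? I suppose this fails if the first line is empty or has fewer than 5 characters, or if you want to count line breaks, because l stays on the current line unless 'whichwrap' includes l.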


  • Related Question

    How to use UTF-8 in vim on Mac OS X?
  • Tadeusz A. Kadłubowski

    I want to edit UTF-8 documents with vim (7.2 installed via MacPorts, big feature set, iconv support enabled, multi-byte support enabled) on Mac OS X 10.4 within Terminal.app.

    Terminal.app is configured to use Monaco font (which has good Unicode coverage) and use UTF-8 as the character set encoding.

    Keyboard map is set up correctly. I can enter some localized characters like „zażółć” and even quotes around that… (yeah, and an ellipsis).

    I've done my best to set up the environment:

    LC_ALL=pl_PL.UTF-8
    LC_CTYPE=pl_PL.UTF-8
    LANG=pl_PL.UTF-8
    export LC_ALL
    export LC_CTYPE
    export LANG
    

    I have no encoding, fileencoding or termencoding set in .vimrc, so they should default to what's set in the locale.

    What else have I missed? I can't enter non-ASCII UTF-8 characters in vim. They are interpreted as single-byte garbage rather than as multi-byte UTF-8 characters.


  • Related Answers
  • CoreSandello

    Check out this:

    (Thanks to Peter Vohmann for this Q&A.) In Terminal.app go to the Terminal (main) menu and choose Window Settings. Select Emulation from the popup menu, un-check the item "Escape non-ASCII characters". Then select Display from the popup menu, set Character Set Encoding to Unicode (UTF-8), if desired. Click on "Use settings as Default."

    (from MacVim Site)

    As far as I remember, Terminal.app on 10.4 has some trouble dealing with UTF-8; checking the settings above will probably help. As an alternative, consider using MacVim, or iTerm as the terminal application.

    Update: as Ben Stiglitz mentioned in comments, 10.4 Terminal is OK, but 10.4 bundled shells are not.
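
    If the terminal and shell turn out to be fine, it may also be worth ruling out Vim's own settings. The fragment below is a sketch for .vimrc, assuming the terminal really sends and displays UTF-8. The options are standard Vim options, but whether setting them explicitly is needed on this particular setup is a guess.

```vim
" Assumption: Terminal.app is configured for UTF-8, as described above.
" To see what Vim picked up from the locale and where it was last set, run:
"   :verbose set encoding? termencoding? fileencoding?

" Use UTF-8 internally and for terminal input/output.
set encoding=utf-8
set termencoding=utf-8
" Encodings to try, in order, when reading an existing file.
set fileencodings=ucs-bom,utf-8,default,latin1
```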

  • donut

    I don't know about Vim in the Terminal, but I have no troubles entering Korean characters in MacVim. This is with no extra setup, just as it came.