---
name: index
signature: 'index STR,SUBSTR,POSITION'
since: 5.0
status: documented
categories: ["SCALARs and strings"]
---
```{index}
single: index; Perl built-in
```

*[SCALARs and strings](../perlfunc-by-category)*

# index

Find the position of a substring inside a string.

`index` scans `STR` left-to-right looking for the first occurrence of `SUBSTR` and returns the zero-based position where it starts. No regular-expression metacharacters, no case folding, no wildcards: `SUBSTR` is matched literally, character for character. When the search fails, `index` returns `-1`.

## Synopsis

```perl
index STR, SUBSTR
index STR, SUBSTR, POSITION
```

## What you get back

An integer. On a match, the zero-based offset of the first character of `SUBSTR` inside `STR`. On no match, `-1`. Comparing against that sentinel is the idiomatic test:

```perl
if (index($line, $needle) >= 0)  { ... }    # found
if (index($line, $needle) == -1) { ... }    # not found
```

`POSITION` and the return value use the **same** zero-based scale, so you can feed one back into the other to walk every occurrence:

```perl
my @hits;
my $pos = -1;
# Restart one character past the previous hit until index reports -1.
while (($pos = index($text, $needle, $pos + 1)) != -1) {
    push @hits, $pos;
}
```

## How POSITION is interpreted

`POSITION` is the earliest offset the match is allowed to start at. The search still proceeds to the end of `STR`; `POSITION` does not bound the search, it only shifts where it begins.

- `POSITION` omitted or [`undef`](undef): search from offset `0`.
- `POSITION` negative: treated as `0`.
- `POSITION` past the end of `STR`: treated as the end, so the only way to match is if `SUBSTR` is the empty string (which matches at any offset, including the end).

An empty `SUBSTR` always matches, and matches at `POSITION` (clamped into range). This follows from the "first position where `SUBSTR` occurs" rule: the empty string occurs everywhere.

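The clamping rules listed above can be checked with a non-empty `SUBSTR`. The following is a minimal sketch; the sample string `"hello"` and the variable names `$str` and `@probes` are illustrative assumptions, not part of the `index` interface.

```perl
use strict;
use warnings;

my $str = "hello";    # illustrative sample string

# Each entry pairs a POSITION with the result of searching for "l" from there.
my @probes = map { [ $_, index($str, "l", $_) ] } (-10, 0, 3, 4, 99);

printf "POSITION %3d -> %2d\n", @$_ for @probes;
# POSITION -10 ->  2   (negative: treated as 0)
# POSITION   0 ->  2
# POSITION   3 ->  3   (the second "l")
# POSITION   4 -> -1   (no "l" at or after offset 4)
# POSITION  99 -> -1   (past the end: nothing left to match)
```

With an empty `SUBSTR`, the same kind of out-of-range `POSITION` still produces a match, as the calls that follow show.
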
```perl
index("hello", "");        # 0
index("hello", "", 3);     # 3
index("hello", "", 99);    # 5  (clamped to end of string)
```

## Examples

Find a single character or a whole word:

```perl
index("Perl is great", "P");        # 0
index("Perl is great", "g");        # 8
index("Perl is great", "great");    # 8
```

Report a miss:

```perl
index("Perl is great", "Z");        # -1
```

Skip past an earlier match with `POSITION` to find the second occurrence:

```perl
index("Perl is great", "e", 5);     # 10
```

Walk every occurrence of a substring:

```perl
my $s = "abcabcabc";
my $p = -1;
while (($p = index($s, "bc", $p + 1)) != -1) {
    print "hit at $p\n";
}
# hit at 1
# hit at 4
# hit at 7
```

A common idiom is to test for containment without building a regex:

```perl
if (index($path, "/tmp/") != -1) {
    warn "path touches /tmp";
}
```

Pairs naturally with [`substr`](substr) to split on the first occurrence of a separator:

```perl
my $line = "key=value=with=equals";
my $eq   = index($line, "=");
# Everything before the first "=" is the key; the rest is the value.
my ($k, $v) = $eq >= 0
    ? (substr($line, 0, $eq), substr($line, $eq + 1))
    : ($line, undef);
```

## Edge cases

- **Empty `SUBSTR`** matches at `POSITION` (clamped into `STR`). `index($s, "")` is `0`; `index($s, "", $n)` is `$n` capped at `length $s`. Never `-1`.
- **Empty `STR` with non-empty `SUBSTR`** returns `-1`. Empty `STR` with empty `SUBSTR` returns `0`.
- **Negative `POSITION`** is clamped to `0`. `index` does **not** interpret negative offsets as "from the end"; that convention belongs to [`substr`](substr), and [`rindex`](rindex) does not use it either.
- **`POSITION` past the end of `STR`** is clamped to `length STR`, so only an empty `SUBSTR` can match.
- **Undef arguments** stringify to `""` and trigger an `uninitialized` warning under `use warnings`. `index(undef, "x")` is `-1`; `index("abc", undef)` is `0` (empty-substring rule).
- **Characters, not bytes.** `index` operates on the logical character sequence of the string. For a string of wide characters, the returned offset is a character offset, not a byte offset.
  If you need byte offsets, work on a byte view of the string first (`use bytes` for a lexical byte view, or `Encode::encode_utf8` to produce an octet string).
- **Case sensitivity**: `index` is case-sensitive. Lowercase both arguments first if you want a case-insensitive search, or match with `=~ /\Q$needle\E/i` and read the position from `$-[0]` (the first element of `@-`).

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`rindex`](rindex): same matching rules, scans from the right and returns the offset of the **last** occurrence at or before `POSITION`
- [`substr`](substr): extract the matched region once `index` has located it, or replace it in place
- [`length`](length): upper bound for a valid `POSITION`; returns character length on the same scale `index` uses
- [`pos`](pos): position tracking for regex-based scanning; use together with `m//g` when you need captures rather than a raw offset
- [`sprintf`](sprintf): build the search string when `SUBSTR` is assembled from parts; `index` matches `SUBSTR` literally, so prepare the exact text first
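
Returning to the case-sensitivity note under Edge cases, the "lowercase both arguments" approach can be sketched as a small wrapper. The name `index_ci` is hypothetical and not a Perl built-in. The sketch assumes lowercasing does not change the length of either string, which holds for ASCII text but not for every Unicode character, so the returned offsets should only be trusted under that assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): case-insensitive variant of index.
# Assumes lc() preserves string length, so offsets into the lowercased
# copy are also valid offsets into the original string.
sub index_ci {
    my ($str, $substr, $pos) = @_;
    return index(lc $str, lc $substr, $pos // 0);
}

my $text  = "Perl is GREAT";
my @found = (
    index_ci($text, "great"),    # matches "GREAT"
    index_ci($text, "PERL"),     # matches "Perl"
    index_ci($text, "e", 5),     # skips the "e" in "Perl"
    index_ci($text, "z"),        # no match
);
print "@found\n";    # 8 0 10 -1
```

The wrapper keeps the `-1` sentinel and the zero-based scale of `index`, so it can be dropped into the same loops and tests.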