1. std::str

Represents Unicode string slices.

Rust has two main string types: &str, a borrowed string slice, and String, an owned, growable string.

Strings are most commonly declared with type &str:

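let hello_world = "Hello, World!"; // declares a string literal

The example above declares a string literal. String literals have the 'static lifetime, so the data that hello_world refers to is valid for the entire run of the program. It is equivalent to the following explicit declaration:

let hello_world: &'static str = "Hello, World!";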
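
To make the difference between the two types concrete, here is a minimal sketch. The helper name `first_word` is hypothetical and exists only for this illustration; it shows that a function taking &str accepts both a string literal and a borrowed String.

```rust
/// Hypothetical helper: returns the first whitespace-separated word of `s`,
/// or an empty string if there is none.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    // A string literal is a `&'static str`.
    let literal: &'static str = "Hello, World!";

    // A `String` owns its buffer and can grow.
    let mut owned = String::from("Hello");
    owned.push_str(", World!");

    // `&owned` coerces from `&String` to `&str`, so both calls compile.
    assert_eq!(first_word(literal), "Hello,");
    assert_eq!(first_word(&owned), "Hello,");

    // The two values compare equal even though their types differ.
    assert_eq!(literal, owned);
}
```

Accepting &str in function signatures is usually the more flexible choice, since callers holding either type can pass their data without copying it.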

2. std::str::from_utf8

pub fn from_utf8(v: &[u8]) -> Result<&str, Utf8Error>

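Converts a slice of bytes to a string slice.

Not every byte slice is a valid string slice, because a &str must be valid UTF-8. from_utf8 checks the bytes and returns Ok(&str) when they are valid UTF-8, or Err(Utf8Error) when they are not. If you are certain the bytes are valid and want to skip the validity check, the unsafe function from_utf8_unchecked has the same behavior without the check.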

use std::str;

// some bytes, in a vector
let sparkle_heart = vec![240, 159, 146, 150];

// We know these bytes are valid, so just use `unwrap()`.
let sparkle_heart = str::from_utf8(&sparkle_heart).unwrap();

assert_eq!("💖", sparkle_heart);

use std::str;

// some invalid bytes, in a vector
let sparkle_heart = vec![0, 159, 146, 150];

assert!(str::from_utf8(&sparkle_heart).is_err());

use std::str;

// some bytes, in a stack-allocated array
let sparkle_heart = [240, 159, 146, 150];

// We know these bytes are valid, so just use `unwrap()`.
let sparkle_heart = str::from_utf8(&sparkle_heart).unwrap();

assert_eq!("💖", sparkle_heart);
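
The examples above call `unwrap()` because the bytes are known to be valid. When the input is untrusted, the Utf8Error can be inspected instead. The following is a sketch under that assumption; the helper name `valid_prefix` is hypothetical, while `Utf8Error::valid_up_to` is the standard-library method that reports how many leading bytes are valid UTF-8.

```rust
use std::str;

/// Hypothetical helper: returns the longest valid UTF-8 prefix of `bytes`.
fn valid_prefix(bytes: &[u8]) -> &str {
    match str::from_utf8(bytes) {
        Ok(s) => s,
        // `valid_up_to()` is the index up to which the input is valid UTF-8,
        // so validating that prefix again is expected to succeed.
        Err(e) => str::from_utf8(&bytes[..e.valid_up_to()]).unwrap_or(""),
    }
}

fn main() {
    // "hi" followed by a lone continuation byte, which is not valid UTF-8.
    let bytes = [b'h', b'i', 0x9F];
    assert_eq!(valid_prefix(&bytes), "hi");

    // Fully valid input is returned unchanged.
    assert_eq!(valid_prefix("Löwe".as_bytes()), "Löwe");
}
```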

3. std::string::String::from_utf8

pub fn from_utf8(vec: Vec<u8>) -> Result<String, FromUtf8Error>

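On success, the return value is an owned String rather than a &str. The function takes ownership of the Vec<u8>, so the bytes are not copied.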

// some bytes, in a vector
let sparkle_heart = vec![240, 159, 146, 150];

// We know these bytes are valid, so we'll use `unwrap()`.
let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();

assert_eq!("💖", sparkle_heart);

// some invalid bytes, in a vector
let sparkle_heart = vec![0, 159, 146, 150];

assert!(String::from_utf8(sparkle_heart).is_err());
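
Because String::from_utf8 consumes the vector, the error value is the only way to get the bytes back after a failure. The sketch below assumes a caller that prefers a lossy result over an error; the helper name `decode_or_lossy` is hypothetical, while `FromUtf8Error::into_bytes` and `String::from_utf8_lossy` are standard-library functions.

```rust
/// Hypothetical helper: decodes `bytes` as UTF-8, falling back to a lossy
/// conversion that replaces invalid sequences with U+FFFD.
fn decode_or_lossy(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        Ok(s) => s,
        Err(e) => {
            // `into_bytes()` hands back the original vector.
            let bytes = e.into_bytes();
            String::from_utf8_lossy(&bytes).into_owned()
        }
    }
}

fn main() {
    // Valid input decodes without any replacement.
    assert_eq!(decode_or_lossy(vec![240, 159, 146, 150]), "💖");

    // The lone continuation byte 0x9F is replaced with U+FFFD.
    assert_eq!(decode_or_lossy(vec![b'h', b'i', 0x9F]), "hi\u{FFFD}");
}
```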

4. The as_bytes() function

pub fn as_bytes(&self) -> &[u8]

Converts a string slice to a byte slice. To convert the byte slice back into a string slice, use the str::from_utf8 function described above.

let bytes = "bors".as_bytes();
assert_eq!(b"bors", bytes);
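
The round trip mentioned above can be sketched as follows. The helper name `round_trip` is hypothetical; it only chains as_bytes() and str::from_utf8 to show that the conversion is lossless for any valid &str.

```rust
use std::str;

/// Hypothetical helper: converts `s` to bytes and back to a string slice.
fn round_trip(s: &str) -> &str {
    // The bytes of a `&str` are always valid UTF-8, so this cannot fail.
    str::from_utf8(s.as_bytes()).expect("bytes of a &str are valid UTF-8")
}

fn main() {
    let original = "Löwe 老虎";

    // `ö` takes 2 bytes, `老` and `虎` take 3 bytes each: 12 bytes in total.
    assert_eq!(original.as_bytes().len(), 12);

    assert_eq!(round_trip(original), original);
}
```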

5. The len() and as_ptr() functions

pub fn len(&self) -> usize

Returns the length of self in bytes, which is not necessarily the number of characters.

let len = "foo".len();
assert_eq!(3, len);

let len = "ƒoo".len(); // fancy f!
assert_eq!(4, len);
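
To contrast the byte length with the number of characters, here is a small sketch. The helper name `byte_and_char_len` is hypothetical; `chars().count()` counts Unicode scalar values (`char`s), which may still differ from what a reader perceives as characters.

```rust
/// Hypothetical helper: returns (length in bytes, number of `char`s).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per char.
    assert_eq!(byte_and_char_len("foo"), (3, 3));

    // `ƒ` is encoded as two bytes in UTF-8.
    assert_eq!(byte_and_char_len("ƒoo"), (4, 3));

    // Each of these CJK characters takes three bytes.
    assert_eq!(byte_and_char_len("老虎"), (6, 2));
}
```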

pub const fn as_ptr(&self) -> *const u8

Converts a string slice to a raw pointer. The returned pointer points to the first byte of the string slice.

use std::slice;
use std::str;

let story = "Once upon a time...";

let ptr = story.as_ptr();
let len = story.len();

// story has nineteen bytes
assert_eq!(19, len);

// We can re-build a str out of ptr and len. This is all unsafe because
// we are responsible for making sure the two components are valid:
let s = unsafe {
    // First, we build a &[u8]...
    let slice = slice::from_raw_parts(ptr, len);

    // ... and then convert that slice into a string slice
    str::from_utf8(slice)
};

assert_eq!(s, Ok(story));
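
A raw pointer can also be compared without dereferencing it, which needs no unsafe code. The sketch below uses a hypothetical helper, `offset_in`, and assumes that `sub` was sliced out of `whole`; under that assumption the pointer difference equals the byte offset of the sub-slice.

```rust
/// Hypothetical helper: byte offset of `sub` within `whole`.
/// Assumes `sub` is a sub-slice of `whole`; for unrelated strings the
/// result is meaningless and the subtraction may overflow.
fn offset_in(whole: &str, sub: &str) -> usize {
    sub.as_ptr() as usize - whole.as_ptr() as usize
}

fn main() {
    let story = "Once upon a time...";

    // Skip the five bytes of "Once ".
    let tail = &story[5..];
    assert_eq!(tail, "upon a time...");

    // The sub-slice starts five bytes after the start of `story`.
    assert_eq!(offset_in(story, tail), 5);
}
```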

6. The is_char_boundary() function

pub fn is_char_boundary(&self, index: usize) -> bool

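Checks whether the byte at position index is the first byte of a UTF-8 code point sequence or the end of the string. The start of the string and its end (index == self.len()) both count as boundaries; an index greater than self.len() returns false.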

let s = "Löwe 老虎 Léopard";
assert!(s.is_char_boundary(0));
// start of `老`
assert!(s.is_char_boundary(6));
assert!(s.is_char_boundary(s.len()));

// second byte of `ö`
assert!(!s.is_char_boundary(2));

// third byte of `老`
assert!(!s.is_char_boundary(8));
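
One common use of this check is cutting a string to a byte budget without splitting a character, since slicing a &str at a non-boundary index panics. The following is a minimal sketch of that idea; the helper name `truncate_at_boundary` is hypothetical and is not part of the standard library.

```rust
/// Hypothetical helper: returns the longest prefix of `s` that is at most
/// `max_bytes` long and ends on a char boundary.
fn truncate_at_boundary(s: &str, max_bytes: usize) -> &str {
    if max_bytes >= s.len() {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

fn main() {
    let s = "Löwe 老虎 Léopard";

    // Index 2 is inside `ö`, so the cut moves back to index 1.
    assert_eq!(truncate_at_boundary(s, 2), "L");

    // Index 8 is inside `老`, so the cut moves back to index 6.
    assert_eq!(truncate_at_boundary(s, 8), "Löwe ");

    // Index 9 is right after `老`, which is a boundary.
    assert_eq!(truncate_at_boundary(s, 9), "Löwe 老");
}
```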

References:
[1] https://doc.rust-lang.org/std/str/index.html
[2] https://doc.rust-lang.org/std/primitive.str.html
