C++: Serializing In-Memory Data Structures to Binary Files and Deserializing Them Back

Source: Internet    Editor: 程序博客网    Time: 2024/05/23 01:47

Use case

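Many back-end retrieval servers have to load data from files at startup and build an in-memory index from it. Building the index tends to be slow, so the server takes a long time to start, and the cost becomes more noticeable when many servers are deployed.
Index construction can instead be moved into a separate process. That process builds the index structure from the raw files and serializes it to a local binary file. At startup the server only has to read the binary file to reconstruct the index, which can shorten startup time considerably.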

Example code

io.hpp wraps std::ifstream and std::ofstream and provides interfaces for serializing a vector to a binary file and for deserializing a binary file back into a vector.

#ifndef IO_HPP
#define IO_HPP

#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

class FileReader
{
public:
    explicit FileReader(const std::string &filename)
        : input_stream(filename, std::ios::binary)
    {
        if (!input_stream)
            throw std::runtime_error("cannot open " + filename + " for reading");
    }

    /* Read count objects of type T into pointer dest */
    template <typename T> void ReadInto(T *dest, const std::size_t count)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "bytewise reading requires trivially copyable type");
        if (count == 0)
            return;
        input_stream.read(reinterpret_cast<char *>(dest), count * sizeof(T));
        const std::size_t bytes_read = input_stream.gcount();
        /* A short read means the file is truncated or corrupt */
        if (!input_stream || bytes_read != count * sizeof(T))
            throw std::runtime_error("unexpected end of file while reading");
    }

    template <typename T> void ReadInto(std::vector<T> &target)
    {
        ReadInto(target.data(), target.size());
    }

    template <typename T> void ReadInto(T &target)
    {
        ReadInto(&target, 1);
    }

    template <typename T> T ReadOne()
    {
        T tmp;
        ReadInto(tmp);
        return tmp;
    }

    std::uint32_t ReadElementCount32()
    {
        return ReadOne<std::uint32_t>();
    }

    std::uint64_t ReadElementCount64()
    {
        return ReadOne<std::uint64_t>();
    }

    /* Read a 64-bit element count, then that many elements */
    template <typename T> void DeserializeVector(std::vector<T> &data)
    {
        const auto count = ReadElementCount64();
        data.resize(count);
        ReadInto(data.data(), count);
    }

private:
    std::ifstream input_stream;
};

class FileWriter
{
public:
    explicit FileWriter(const std::string &filename)
        : output_stream(filename, std::ios::binary)
    {
        if (!output_stream)
            throw std::runtime_error("cannot open " + filename + " for writing");
    }

    /* Write count objects of type T from pointer src to output stream */
    template <typename T> void WriteFrom(const T *src, const std::size_t count)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "bytewise writing requires trivially copyable type");
        if (count == 0)
            return;
        output_stream.write(reinterpret_cast<const char *>(src), count * sizeof(T));
        if (!output_stream)
            throw std::runtime_error("failed to write to output file");
    }

    template <typename T> void WriteFrom(const T &target)
    {
        WriteFrom(&target, 1);
    }

    template <typename T> void WriteOne(const T tmp)
    {
        WriteFrom(tmp);
    }

    void WriteElementCount32(const std::uint32_t count)
    {
        WriteOne<std::uint32_t>(count);
    }

    void WriteElementCount64(const std::uint64_t count)
    {
        WriteOne<std::uint64_t>(count);
    }

    /* Write a 64-bit element count, then the raw elements */
    template <typename T> void SerializeVector(const std::vector<T> &data)
    {
        const std::uint64_t count = data.size();
        WriteElementCount64(count);
        WriteFrom(data.data(), count);
    }

private:
    std::ofstream output_stream;
};

#endif
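
The file layout produced by SerializeVector is simple: an 8-byte element count followed by the raw bytes of the elements. The sketch below checks that layout by comparing the file size with 8 + n * sizeof(T). It does not include io.hpp; write_vector, file_size and expected_size are hypothetical helper names introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of io.hpp): writes a vector with the same
// layout as SerializeVector, i.e. a 64-bit count followed by the raw elements.
template <typename T>
void write_vector(const std::string &filename, const std::vector<T> &data)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "bytewise writing requires trivially copyable type");
    std::ofstream out(filename, std::ios::binary);
    const std::uint64_t count = data.size();
    out.write(reinterpret_cast<const char *>(&count), sizeof(count));
    if (count != 0)
        out.write(reinterpret_cast<const char *>(data.data()), count * sizeof(T));
    // The stream is flushed and closed when out goes out of scope.
}

// Hypothetical helper: size of a file in bytes, or -1 if it cannot be opened.
long long file_size(const std::string &filename)
{
    std::ifstream in(filename, std::ios::binary | std::ios::ate);
    if (!in)
        return -1;
    const std::streamoff size = in.tellg();
    return static_cast<long long>(size);
}

// Expected file size for n elements of type T: count header plus payload.
template <typename T>
std::size_t expected_size(const std::size_t n)
{
    return sizeof(std::uint64_t) + n * sizeof(T);
}
```

Because the payload is sizeof(T) bytes per element, any padding inside T is written as well. The file is therefore only meaningful to a reader built with the same object layout.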

binary_io.cpp

#include "io.hpp"

#include <iostream>
#include <string>
#include <vector>

struct Data
{
    int a;
    double b;

    friend std::ostream &operator<<(std::ostream &out, const Data &data)
    {
        out << data.a << "," << data.b;
        return out;
    }
};

template <typename T>
void printData(const std::vector<T> &data_vec)
{
    // Iterate by reference to avoid copying each element
    for (const auto &data : data_vec)
    {
        std::cout << "{" << data << "} ";
    }
    std::cout << std::endl;
}

template <typename T>
void serializeVector(const std::string &filename, const std::vector<T> &data_vec)
{
    // The file is flushed and closed when file_writer goes out of scope
    FileWriter file_writer(filename);
    file_writer.SerializeVector<T>(data_vec);
}

template <typename T>
void deserializeVector(const std::string &filename, std::vector<T> &data_vec)
{
    FileReader file_reader(filename);
    file_reader.DeserializeVector<T>(data_vec);
}

int main()
{
    std::vector<Data> vec1 = {{1, 1.1}, {2, 2.2}, {3, 3.3}, {4, 4.4}};
    std::cout << "before write to binary file.\n";
    printData(vec1);

    const std::string filename = "vector_data";
    std::cout << "serialize vector to binary file.\n";
    serializeVector<Data>(filename, vec1);

    std::vector<Data> vec2;
    deserializeVector<Data>(filename, vec2);
    std::cout << "vector read from binary file.\n";
    printData(vec2);
    return 0;
}

Compile

g++ -std=c++11 binary_io.cpp -o binary_io

Run

./binary_io

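Output

The program writes the vector held in memory to a binary file and then deserializes that file into a new vector. The contents printed before serialization and after deserialization are identical.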

Notes

Any data structure serialized to a file this way must satisfy std::is_trivially_copyable, a type trait introduced in C++11. A TriviallyCopyable class type meets the following requirements:

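- Every copy constructor is trivial or deleted
- Every move constructor is trivial or deleted
- Every copy assignment operator is trivial or deleted
- Every move assignment operator is trivial or deleted
- At least one of the above is non-deleted
- The destructor is trivial and non-deleted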
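
These requirements can be checked at compile time. The sketch below applies std::is_trivially_copyable to a few types; Data mirrors the struct from binary_io.cpp (without operator<<), while Named, Node and can_write_raw are hypothetical names used only for illustration.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Same members as the Data struct in binary_io.cpp.
struct Data
{
    int a;
    double b;
};

// Hypothetical type with a std::string member; std::string owns heap memory
// and has non-trivial copy operations and a non-trivial destructor.
struct Named
{
    int id;
    std::string name;
};

// Hypothetical type holding a raw pointer. It is trivially copyable, but the
// stored address is generally meaningless once read back in another process.
struct Node
{
    int value;
    Node *next;
};

// Hypothetical helper: true if T may be written with a raw byte copy.
template <typename T>
constexpr bool can_write_raw()
{
    return std::is_trivially_copyable<T>::value;
}

static_assert(can_write_raw<Data>(), "plain aggregates of scalars qualify");
static_assert(!can_write_raw<Named>(), "a std::string member disqualifies the type");
static_assert(!can_write_raw<std::vector<int>>(), "std::vector itself does not qualify");
static_assert(can_write_raw<Node>(), "raw pointers are copied as plain addresses");
```

The Node case shows that passing the trait is necessary but not sufficient for a useful file: the bytes round-trip exactly, yet a pointer written by one process usually does not point at anything valid in the next.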

The following passage explains what this property means for objects of such types:

Objects of trivially-copyable types are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read(). In general, a trivially copyable type is any type for which the underlying bytes can be copied to an array of char or unsigned char and into a new object of the same type, and the resulting object would have the same value as the original
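
The byte-copy property described above can be sketched without any file I/O: copy an object into an unsigned char buffer with std::memcpy, then copy the buffer into a fresh object. The function name round_trip is an assumption made for this example, and it additionally requires T to be default constructible.

```cpp
#include <cstring>
#include <type_traits>

// Same members as the Data struct in binary_io.cpp.
struct Data
{
    int a;
    double b;
};

// Hypothetical helper: copies src to a byte buffer and back into a new object.
template <typename T>
T round_trip(const T &src)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte copies are only safe for trivially copyable types");
    unsigned char buffer[sizeof(T)];
    std::memcpy(buffer, &src, sizeof(T));  // object -> bytes
    T copy;
    std::memcpy(&copy, buffer, sizeof(T)); // bytes -> new object
    return copy;
}
```

Calling round_trip(Data{3, 3.3}) yields an object whose a and b compare equal to the original, because every byte of the object representation is preserved. Writing the buffer to a file and reading it back is the same operation with storage in between.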

Only objects of trivially copyable types are guaranteed to have the same value after being serialized to a binary file and deserialized back into memory.
