2021SC@SDUSC


Preface: This post continues from the previous one. It analyzes the Data Matrix code in ZXing and explains how Data Matrix encoding is implemented.


1. XXEncoder

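As noted in the previous post, the first step of Data Matrix encoding is to convert the original message into Data Matrix codewords. Each codeword lies in the range 0 to 255, i.e. it fits in an unsigned char. The default scheme is ASCII encodation, in which each character is emitted as its ASCII value plus 1. To shorten the output, two consecutive digits are combined into a single codeword equal to their two-digit value plus 130. To compress further, ASCII can be mixed with the other encodation modes covered in the sections that follow: C40, Text, X12, EDIFACT and Base 256.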
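The two ASCII rules can be tried out in isolation. The following is a minimal sketch, not ZXing's implementation: the class `AsciiSketch` and the method `encodeAscii` are names made up for illustration, and the sketch assumes the input contains only 7-bit ASCII characters and never switches to another mode.

```java
import java.util.ArrayList;
import java.util.List;

final class AsciiSketch {

  // Encodes 7-bit ASCII text: a digit pair becomes (two-digit value + 130),
  // any other character becomes (ASCII value + 1).
  static List<Integer> encodeAscii(String msg) {
    List<Integer> codewords = new ArrayList<>();
    int pos = 0;
    while (pos < msg.length()) {
      char c = msg.charAt(pos);
      if (c > 127) {
        throw new IllegalArgumentException("sketch handles 7-bit ASCII only: " + (int) c);
      }
      boolean pair = isDigit(c) && pos + 1 < msg.length() && isDigit(msg.charAt(pos + 1));
      if (pair) {
        int value = (c - '0') * 10 + (msg.charAt(pos + 1) - '0');
        codewords.add(value + 130);
        pos += 2;
      } else {
        codewords.add(c + 1);
        pos++;
      }
    }
    return codewords;
  }

  private static boolean isDigit(char c) {
    return c >= '0' && c <= '9';
  }

  public static void main(String[] args) {
    System.out.println(encodeAscii("A12")); // prints [66, 142]
    System.out.println(encodeAscii("123")); // prints [142, 52]
  }
}
```

In "A12" the letter costs one codeword and the digit pair costs one codeword, so three characters fit into two codewords.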
Before going through the individual encodation modes, note that every encoder implements the same interface, Encoder:

interface Encoder {

  int getEncodingMode(); 
  void encode(EncoderContext context);
}

They all share the same final class, EncoderContext (setters, getters and similar methods are omitted here):

final class EncoderContext {

  private final String msg;
  private SymbolShapeHint shape;
  private Dimension minSize;
  private Dimension maxSize;
  private final StringBuilder codewords;
  int pos;
  private int newEncoding;
  private SymbolInfo symbolInfo;
  private int skipAtEnd;

  EncoderContext(String msg) {
    // From this point on, the string is no longer Unicode
    byte[] msgBinary = msg.getBytes(StandardCharsets.ISO_8859_1);
    StringBuilder sb = new StringBuilder(msgBinary.length);
    for (int i = 0, c = msgBinary.length; i < c; i++) {
      char ch = (char) (msgBinary[i] & 0xff);
      if (ch == '?' && msg.charAt(i) != '?') {
        throw new IllegalArgumentException("Message contains characters outside ISO-8859-1 encoding.");
      }
      sb.append(ch);
    }
    this.msg = sb.toString(); // not Unicode here
    shape = SymbolShapeHint.FORCE_NONE;
    this.codewords = new StringBuilder(msg.length());
    newEncoding = -1;
  }
 
  public void updateSymbolInfo(int len) {
    if (this.symbolInfo == null || len > this.symbolInfo.getDataCapacity()) {
      this.symbolInfo = SymbolInfo.lookup(len, shape, minSize, maxSize, true);
    }
  }
  public void resetSymbolInfo() {
    this.symbolInfo = null;
  }
}
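The constructor's character check relies on the fact that String.getBytes replaces every character the charset cannot represent with '?'. A '?' in the encoded bytes where the original character was not '?' therefore signals an unmappable character. Below is a minimal standalone sketch of the same test. The class `Latin1CheckSketch` and the method `isLatin1Encodable` are hypothetical names, and the extra length check is a defensive addition that is not part of the listing above.

```java
import java.nio.charset.StandardCharsets;

final class Latin1CheckSketch {

  // Returns true when every character of msg can be represented in ISO-8859-1.
  static boolean isLatin1Encodable(String msg) {
    byte[] bytes = msg.getBytes(StandardCharsets.ISO_8859_1);
    if (bytes.length != msg.length()) {
      return false; // defensive: lengths can only differ if some input was unmappable
    }
    for (int i = 0; i < bytes.length; i++) {
      char ch = (char) (bytes[i] & 0xff);
      // A '?' that was not in the input is the charset's replacement byte
      if (ch == '?' && msg.charAt(i) != '?') {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isLatin1Encodable("caf\u00e9"));     // prints true
    System.out.println(isLatin1Encodable("\u4e2d\u6587"));  // prints false
  }
}
```

A caller could run such a check before constructing the context in order to report the problem early instead of catching IllegalArgumentException.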

1.1 ASCIIEncoder

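ASCIIEncoder encodes the Data Matrix message using ASCII encodation.
The switch statement in its encode method latches from ASCII into another mode. Every other encoder, when the look-ahead test recommends a mode other than its own, first hands control back to ASCII encodation, and ASCIIEncoder then performs the actual latch. The hand-back code is shown here (it is omitted from the listings in the later sections):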

if (newMode != getEncodingMode()) {
  // Return to ASCII encodation, which then handles the latch to the new mode
  context.signalEncoderChange(HighLevelEncoder.ASCII_ENCODATION);
  break;
}
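The hand-back only makes sense together with an outer loop that picks the next encoder. That loop is not shown in this post, so the following is a toy model under stated assumptions: the types `Ctx` and `Enc`, the `UPPER` mode and the `run` method are all invented for illustration and are not ZXing's classes. It sketches one way a driver might react to signalEncoderChange.

```java
import java.util.ArrayList;
import java.util.List;

final class ModeSwitchSketch {
  static final int ASCII = 0;
  static final int UPPER = 1; // hypothetical second mode, standing in for C40, X12, ...

  // Toy stand-in for EncoderContext: message, position and a pending mode change.
  static final class Ctx {
    final String msg;
    final List<String> out = new ArrayList<>();
    int pos;
    int newEncoding = -1; // -1 means "no change requested"

    Ctx(String msg) { this.msg = msg; }
    boolean hasMoreCharacters() { return pos < msg.length(); }
    char current() { return msg.charAt(pos); }
    void signalEncoderChange(int mode) { newEncoding = mode; }
  }

  interface Enc { void encode(Ctx ctx); }

  // ASCII handles one character per call and latches to UPPER on a capital letter.
  static final Enc ASCII_ENC = ctx -> {
    if (Character.isUpperCase(ctx.current())) {
      ctx.signalEncoderChange(UPPER);
      return;
    }
    ctx.out.add("ASCII:" + ctx.current());
    ctx.pos++;
  };

  // UPPER keeps consuming until another mode fits better, then hands back to ASCII.
  static final Enc UPPER_ENC = ctx -> {
    while (ctx.hasMoreCharacters()) {
      if (!Character.isUpperCase(ctx.current())) {
        ctx.signalEncoderChange(ASCII);
        break;
      }
      ctx.out.add("UPPER:" + ctx.current());
      ctx.pos++;
    }
  };

  // Driver loop: run the current encoder, then apply any requested mode change.
  static List<String> run(String msg) {
    Enc[] encoders = {ASCII_ENC, UPPER_ENC};
    Ctx ctx = new Ctx(msg);
    int mode = ASCII;
    while (ctx.hasMoreCharacters()) {
      encoders[mode].encode(ctx);
      if (ctx.newEncoding >= 0) {
        mode = ctx.newEncoding;
        ctx.newEncoding = -1;
      }
    }
    return ctx.out;
  }

  public static void main(String[] args) {
    System.out.println(run("abCDe")); // prints [ASCII:a, ASCII:b, UPPER:C, UPPER:D, ASCII:e]
  }
}
```

The point of the design is that an encoder never calls another encoder directly. It only records the requested mode in the context and returns, which keeps each encoder independent of the others.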

The code of ASCIIEncoder is as follows:

final class ASCIIEncoder implements Encoder {
  @Override
  public void encode(EncoderContext context) {
    //step B
    int n = HighLevelEncoder.determineConsecutiveDigitCount(context.getMessage(), context.pos);
    if (n >= 2) {
      context.writeCodeword(encodeASCIIDigits(context.getMessage().charAt(context.pos),
                                              context.getMessage().charAt(context.pos + 1)));
      context.pos += 2;
    } else {
      char c = context.getCurrentChar();
      int newMode = HighLevelEncoder.lookAheadTest(context.getMessage(), context.pos, getEncodingMode());
      if (newMode != getEncodingMode()) {
        switch (newMode) {
          case HighLevelEncoder.BASE256_ENCODATION:
            context.writeCodeword(HighLevelEncoder.LATCH_TO_BASE256);
            context.signalEncoderChange(HighLevelEncoder.BASE256_ENCODATION);
            return;
          case HighLevelEncoder.C40_ENCODATION:
            context.writeCodeword(HighLevelEncoder.LATCH_TO_C40);
            context.signalEncoderChange(HighLevelEncoder.C40_ENCODATION);
            return;
          case HighLevelEncoder.X12_ENCODATION:
            context.writeCodeword(HighLevelEncoder.LATCH_TO_ANSIX12);
            context.signalEncoderChange(HighLevelEncoder.X12_ENCODATION);
            break;
          case HighLevelEncoder.TEXT_ENCODATION:
            context.writeCodeword(HighLevelEncoder.LATCH_TO_TEXT);
            context.signalEncoderChange(HighLevelEncoder.TEXT_ENCODATION);
            break;
          case HighLevelEncoder.EDIFACT_ENCODATION:
            context.writeCodeword(HighLevelEncoder.LATCH_TO_EDIFACT);
            context.signalEncoderChange(HighLevelEncoder.EDIFACT_ENCODATION);
            break;
          default:
            throw new IllegalStateException("Illegal mode: " + newMode);
        }
      } else if (HighLevelEncoder.isExtendedASCII(c)) {
        context.writeCodeword(HighLevelEncoder.UPPER_SHIFT);
        context.writeCodeword((char) (c - 128 + 1));
        context.pos++;
      } else {
        context.writeCodeword((char) (c + 1));
        context.pos++;
      }
    }
  }

  private static char encodeASCIIDigits(char digit1, char digit2) {
    if (HighLevelEncoder.isDigit(digit1) && HighLevelEncoder.isDigit(digit2)) {
      int num = (digit1 - 48) * 10 + (digit2 - 48);
      return (char) (num + 130);
    }
    throw new IllegalArgumentException("not digits: " + digit1 + digit2);
  }

}
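The extended-ASCII branch is easy to miss, so here is a small worked sketch of it. It is an illustration only: the class `UpperShiftSketch` is a made-up name, and the numeric value 235 for the Upper Shift codeword is taken from the Data Matrix codeword table rather than from the listing above, which only uses the constant HighLevelEncoder.UPPER_SHIFT.

```java
final class UpperShiftSketch {
  // Assumption: Upper Shift is codeword 235 in Data Matrix ASCII encodation.
  static final int UPPER_SHIFT = 235;

  // Returns the ASCII-mode codewords for a single character in the range 0..255.
  static int[] encodeChar(char c) {
    if (c > 255) {
      throw new IllegalArgumentException("outside ISO-8859-1: " + (int) c);
    }
    if (c >= 128) {
      // Extended ASCII: Upper Shift, then the character moved down by 128, plus 1
      return new int[] {UPPER_SHIFT, c - 128 + 1};
    }
    return new int[] {c + 1};
  }

  public static void main(String[] args) {
    int[] cw = encodeChar('\u00e9'); // e with acute accent, value 233
    System.out.println(cw[0] + " " + cw[1]); // prints 235 106
  }
}
```

An extended character therefore costs two codewords in ASCII mode, which is one reason the look-ahead test may prefer Base 256 for long runs of such characters.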

1.2 C40Encoder

C40Encoder encodes the Data Matrix message using C40 encodation.

class C40Encoder implements Encoder {
  @Override
  public void encode(EncoderContext context) {
    //step C
    StringBuilder buffer = new StringBuilder();
    while (context.hasMoreCharacters()) {
      char c = context.getCurrentChar();
      context.pos++;

      int lastCharSize = encodeChar(c, buffer);

      int unwritten = (buffer.length() / 3) * 2;

      int curCodewordCount = context.getCodewordCount() + unwritten;
      context.updateSymbolInfo(curCodewordCount);
      int available = context.getSymbolInfo().getDataCapacity() - curCodewordCount;

      if (!context.hasMoreCharacters()) {
        // Avoid having a single C40 value in the last triplet
        StringBuilder removed = new StringBuilder();
        if ((buffer.length() % 3) == 2 && available != 2) {
          lastCharSize = backtrackOneCharacter(context, buffer, removed, lastCharSize);
        }
        while ((buffer.length() % 3) == 1 && (lastCharSize > 3 || available != 1)) {
          lastCharSize = backtrackOneCharacter(context, buffer, removed, lastCharSize);
        }
        break;
      }

      int count = buffer.length();
      if ((count % 3) == 0) {
        int newMode = HighLevelEncoder.lookAheadTest(context.getMessage(), context.pos, getEncodingMode());
        // ... hand-back to ASCII omitted here (see section 1.1)
      }
    }
    handleEOD(context, buffer);
  }

  private int backtrackOneCharacter(EncoderContext context,
                                    StringBuilder buffer, StringBuilder removed, int lastCharSize) {
    int count = buffer.length();
    buffer.delete(count - lastCharSize, count);
    context.pos--;
    char c = context.getCurrentChar();
    lastCharSize = encodeChar(c, removed);
    context.resetSymbolInfo(); // Deal with a possible reduction in symbol size
    return lastCharSize;
  }

  static void writeNextTriplet(EncoderContext context, StringBuilder buffer) {
    context.writeCodewords(encodeToCodewords(buffer));
    buffer.delete(0, 3);
  }

  private static String encodeToCodewords(CharSequence sb) {
    int v = (1600 * sb.charAt(0)) + (40 * sb.charAt(1)) + sb.charAt(2) + 1;
    char cw1 = (char) (v / 256);
    char cw2 = (char) (v % 256);
    return new String(new char[] {cw1, cw2});
  }

}
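The packing done by encodeToCodewords can be checked by hand. The sketch below repeats the same arithmetic on plain ints and adds the inverse operation. The class `C40PackSketch` and the method `unpack` are hypothetical names added for illustration, and the C40 values 14, 22 and 26 used in the example are the values of 'A', 'I' and 'M' under the mapping `c - 65 + 14`.

```java
final class C40PackSketch {

  // Packs three C40 values (each 0..39) into two codewords.
  static int[] pack(int c1, int c2, int c3) {
    int v = 1600 * c1 + 40 * c2 + c3 + 1;
    return new int[] {v / 256, v % 256};
  }

  // Reverses pack(): recovers the three C40 values from two codewords.
  static int[] unpack(int cw1, int cw2) {
    int v = cw1 * 256 + cw2 - 1;
    return new int[] {v / 1600, (v / 40) % 40, v % 40};
  }

  public static void main(String[] args) {
    int[] cw = pack(14, 22, 26); // "AIM"
    System.out.println(cw[0] + " " + cw[1]); // prints 91 11
    int[] back = unpack(cw[0], cw[1]);
    System.out.println(back[0] + " " + back[1] + " " + back[2]); // prints 14 22 26
  }
}
```

Because the largest possible value is 1600 * 39 + 40 * 39 + 39 + 1 = 64000, the result always fits into two bytes, so three characters of the basic set take two codewords.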

C40Encoder also defines the methods encodeChar and handleEOD, which its subclasses override.

  int encodeChar(char c, StringBuilder sb) {
    if (c == ' ') {
      sb.append('\3');
      return 1;
    }
    if (c >= '0' && c <= '9') {
      sb.append((char) (c - 48 + 4));
      return 1;
    }
    if (c >= 'A' && c <= 'Z') {
      sb.append((char) (c - 65 + 14));
      return 1;
    }
    if (c < ' ') {
      sb.append('\0'); //Shift 1 Set
      sb.append(c);
      return 2;
    }
    if (c <= '/') {
      sb.append('\1'); //Shift 2 Set
      sb.append((char) (c - 33));
      return 2;
    }
    if (c <= '@') {
      sb.append('\1'); //Shift 2 Set
      sb.append((char) (c - 58 + 15));
      return 2;
    }
    if (c <= '_') {
      sb.append('\1'); //Shift 2 Set
      sb.append((char) (c - 91 + 22));
      return 2;
    }
    if (c <= 127) {
      sb.append('\2'); //Shift 3 Set
      sb.append((char) (c - 96));
      return 2;
    }
    sb.append("\1\u001e"); //Shift 2, Upper Shift
    int len = 2;
    len += encodeChar((char) (c - 128), sb);
    return len;
  }

handleEOD handles the "end of data" situation. Its parameters are context, the encoder context, and buffer, the buffer holding the remaining encoded characters.

  void handleEOD(EncoderContext context, StringBuilder buffer) {
    int unwritten = (buffer.length() / 3) * 2;
    int rest = buffer.length() % 3;

    int curCodewordCount = context.getCodewordCount() + unwritten;
    context.updateSymbolInfo(curCodewordCount);
    int available = context.getSymbolInfo().getDataCapacity() - curCodewordCount;

    if (rest == 2) {
      buffer.append('\0'); //Shift 1
      while (buffer.length() >= 3) {
        writeNextTriplet(context, buffer);
      }
      if (context.hasMoreCharacters()) {
        context.writeCodeword(HighLevelEncoder.C40_UNLATCH);
      }
    } else if (available == 1 && rest == 1) {
      while (buffer.length() >= 3) {
        writeNextTriplet(context, buffer);
      }
      if (context.hasMoreCharacters()) {
        context.writeCodeword(HighLevelEncoder.C40_UNLATCH);
      }
      // else no unlatch
      context.pos--;
    } else if (rest == 0) {
      while (buffer.length() >= 3) {
        writeNextTriplet(context, buffer);
      }
      if (available > 0 || context.hasMoreCharacters()) {
        context.writeCodeword(HighLevelEncoder.C40_UNLATCH);
      }
    } else {
      throw new IllegalStateException("Unexpected case. Please report!");
    }
    context.signalEncoderChange(HighLevelEncoder.ASCII_ENCODATION);
  }

1.3 TextEncoder

TextEncoder encodes the Data Matrix message using Text encodation. Unlike the encoders above, it extends C40Encoder and overrides only the encodeChar method.

final class TextEncoder extends C40Encoder {
  @Override
  int encodeChar(char c, StringBuilder sb) {
    if (c == ' ') {
      sb.append('\3');
      return 1;
    }
    if (c >= '0' && c <= '9') {
      sb.append((char) (c - 48 + 4));
      return 1;
    }
    if (c >= 'a' && c <= 'z') {
      sb.append((char) (c - 97 + 14));
      return 1;
    }
    if (c < ' ') {
      sb.append('\0'); //Shift 1 Set
      sb.append(c);
      return 2;
    }
    if (c <= '/') {
      sb.append('\1'); //Shift 2 Set
      sb.append((char) (c - 33));
      return 2;
    }
    if (c <= '@') {
      sb.append('\1'); //Shift 2 Set
      sb.append((char) (c - 58 + 15));
      return 2;
    }
    if (c >= '[' && c <= '_') {
      sb.append('\1'); //Shift 2 Set
      sb.append((char) (c - 91 + 22));
      return 2;
    }
    if (c == '`') {
      sb.append('\2'); //Shift 3 Set
      sb.append((char) 0); // '`' - 96 == 0
      return 2;
    }
    if (c <= 'Z') {
      sb.append('\2'); //Shift 3 Set
      sb.append((char) (c - 65 + 1));
      return 2;
    }
    if (c <= 127) {
      sb.append('\2'); //Shift 3 Set
      sb.append((char) (c - 123 + 27));
      return 2;
    }
    sb.append("\1\u001e"); //Shift 2, Upper Shift
    int len = 2;
    len += encodeChar((char) (c - 128), sb);
    return len;
  }

}
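The practical difference between the two character sets is which letters are cheap. The sketch below counts how many C40 or Text values a message needs, following the return values of the two encodeChar listings. It is a simplified illustration: `C40VsTextSketch`, `charCost` and `totalCost` are made-up names, and characters above 127 are rejected rather than handled through Upper Shift.

```java
final class C40VsTextSketch {

  // Number of values needed for one 7-bit ASCII character.
  // textSet = true uses the Text set, textSet = false uses the C40 set.
  static int charCost(char c, boolean textSet) {
    if (c > 127) {
      throw new IllegalArgumentException("sketch handles 7-bit ASCII only: " + (int) c);
    }
    if (c == ' ' || (c >= '0' && c <= '9')) {
      return 1; // basic set in both modes
    }
    if (c >= 'a' && c <= 'z') {
      return textSet ? 1 : 2; // basic in Text, Shift 3 in C40
    }
    if (c >= 'A' && c <= 'Z') {
      return textSet ? 2 : 1; // Shift 3 in Text, basic in C40
    }
    return 2; // everything else needs a shift value first
  }

  // Total number of values for a whole message.
  static int totalCost(String msg, boolean textSet) {
    int total = 0;
    for (int i = 0; i < msg.length(); i++) {
      total += charCost(msg.charAt(i), textSet);
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(totalCost("hello", true));        // prints 5
    System.out.println(totalCost("hello", false));       // prints 10
    System.out.println(totalCost("Hello World", true));  // prints 13
    System.out.println(totalCost("Hello World", false)); // prints 19
  }
}
```

Since every three values become two codewords, predominantly lowercase text is noticeably shorter in Text encodation, and predominantly uppercase text is shorter in C40.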

1.4 X12Encoder

X12Encoder encodes the Data Matrix message using X12 encodation. Like TextEncoder, it extends C40Encoder.

final class X12Encoder extends C40Encoder {
  @Override
  public void encode(EncoderContext context) {
    //step C
    StringBuilder buffer = new StringBuilder();
    while (context.hasMoreCharacters()) {
      char c = context.getCurrentChar();
      context.pos++;

      encodeChar(c, buffer);

      int count = buffer.length();
      if ((count % 3) == 0) {
        writeNextTriplet(context, buffer);

        int newMode = HighLevelEncoder.lookAheadTest(context.getMessage(), context.pos, getEncodingMode());
        // ... hand-back to ASCII omitted here (see section 1.1)
      }
    }
    handleEOD(context, buffer);
  }

  @Override
  int encodeChar(char c, StringBuilder sb) {
    switch (c) {
      case '\r':
        sb.append('\0');
        break;
      case '*':
        sb.append('\1');
        break;
      case '>':
        sb.append('\2');
        break;
      case ' ':
        sb.append('\3');
        break;
      default:
        if (c >= '0' && c <= '9') {
          sb.append((char) (c - 48 + 4));
        } else if (c >= 'A' && c <= 'Z') {
          sb.append((char) (c - 65 + 14));
        } else {
          HighLevelEncoder.illegalCharacter(c);
        }
        break;
    }
    return 1;
  }

  @Override
  void handleEOD(EncoderContext context, StringBuilder buffer) {
    context.updateSymbolInfo();
    int available = context.getSymbolInfo().getDataCapacity() - context.getCodewordCount();
    int count = buffer.length();
    context.pos -= count;
    if (context.getRemainingCharacters() > 1 || available > 1 ||
        context.getRemainingCharacters() != available) {
      context.writeCodeword(HighLevelEncoder.X12_UNLATCH);
    }
    if (context.getNewEncoding() < 0) {
      context.signalEncoderChange(HighLevelEncoder.ASCII_ENCODATION);
    }
  }
}
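X12 has a much smaller character set than C40, and anything outside it ends in HighLevelEncoder.illegalCharacter. The following sketch mirrors the mapping in encodeChar as a side-effect-free lookup so that a message can be checked up front. The class `X12Sketch` and its methods are hypothetical helpers, not part of ZXing.

```java
final class X12Sketch {

  // Returns the X12 value (0..39) of c, or -1 if c is outside the X12 character set.
  static int x12Value(char c) {
    switch (c) {
      case '\r': return 0;
      case '*':  return 1;
      case '>':  return 2;
      case ' ':  return 3;
      default:
        if (c >= '0' && c <= '9') {
          return c - '0' + 4;   // 4..13
        }
        if (c >= 'A' && c <= 'Z') {
          return c - 'A' + 14;  // 14..39
        }
        return -1;
    }
  }

  // True when every character of msg has an X12 value.
  static boolean isX12Encodable(String msg) {
    for (int i = 0; i < msg.length(); i++) {
      if (x12Value(msg.charAt(i)) < 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(x12Value('A'));              // prints 14
    System.out.println(isX12Encodable("ABC>123*")); // prints true
    System.out.println(isX12Encodable("abc"));      // prints false
  }
}
```

Every X12 character maps to exactly one value, which is why the overridden encodeChar always returns 1 and there are no shift sets to handle.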

1.5 EdifactEncoder

EdifactEncoder encodes the Data Matrix message using EDIFACT encodation.

final class EdifactEncoder implements Encoder {
  @Override
  public void encode(EncoderContext context) {
    //step F
    StringBuilder buffer = new StringBuilder();
    while (context.hasMoreCharacters()) {
      char c = context.getCurrentChar();
      encodeChar(c, buffer);
      context.pos++;

      int count = buffer.length();
      if (count >= 4) {
        context.writeCodewords(encodeToCodewords(buffer));
        buffer.delete(0, 4);

        int newMode = HighLevelEncoder.lookAheadTest(context.getMessage(), context.pos, getEncodingMode());
        // ... hand-back to ASCII omitted here (see section 1.1)
      }
    }
    buffer.append((char) 31); // Unlatch
    handleEOD(context, buffer);
  }
  // Handles the "end of data" situation.
  // context: the encoder context
  // buffer: the buffer holding the remaining encoded characters
  private static void handleEOD(EncoderContext context, CharSequence buffer) {
    try {
      int count = buffer.length();
      if (count == 0) {
        return; // already finished
      }
      if (count == 1) {
        // Only an unlatch at the end
        context.updateSymbolInfo();
        int available = context.getSymbolInfo().getDataCapacity() - context.getCodewordCount();
        int remaining = context.getRemainingCharacters();
        // The following two lines are inspired by https://sourceforge.net/p/barcode4j/svn/221/
        if (remaining > available) {
          context.updateSymbolInfo(context.getCodewordCount() + 1);
          available = context.getSymbolInfo().getDataCapacity() - context.getCodewordCount();
        }
        if (remaining <= available && available <= 2) {
          return; // no unlatch
        }
      }

      if (count > 4) {
        throw new IllegalStateException("Count must not exceed 4");
      }
      int restChars = count - 1;
      String encoded = encodeToCodewords(buffer);
      boolean endOfSymbolReached = !context.hasMoreCharacters();
      boolean restInAscii = endOfSymbolReached && restChars <= 2;

      if (restChars <= 2) {
        context.updateSymbolInfo(context.getCodewordCount() + restChars);
        int available = context.getSymbolInfo().getDataCapacity() - context.getCodewordCount();
        if (available >= 3) {
          restInAscii = false;
          context.updateSymbolInfo(context.getCodewordCount() + encoded.length());
        }
      }

      if (restInAscii) {
        context.resetSymbolInfo();
        context.pos -= restChars;
      } else {
        context.writeCodewords(encoded);
      }
    } finally {
      context.signalEncoderChange(HighLevelEncoder.ASCII_ENCODATION);
    }
  }

  private static void encodeChar(char c, StringBuilder sb) {
    if (c >= ' ' && c <= '?') {
      sb.append(c);
    } else if (c >= '@' && c <= '^') {
      sb.append((char) (c - 64));
    } else {
      HighLevelEncoder.illegalCharacter(c);
    }
  }

  private static String encodeToCodewords(CharSequence sb) {
    int len = sb.length();
    if (len == 0) {
      throw new IllegalStateException("StringBuilder must not be empty");
    }
    char c1 = sb.charAt(0);
    char c2 = len >= 2 ? sb.charAt(1) : 0;
    char c3 = len >= 3 ? sb.charAt(2) : 0;
    char c4 = len >= 4 ? sb.charAt(3) : 0;

    int v = (c1 << 18) + (c2 << 12) + (c3 << 6) + c4;
    char cw1 = (char) ((v >> 16) & 255);
    char cw2 = (char) ((v >> 8) & 255);
    char cw3 = (char) (v & 255);
    StringBuilder res = new StringBuilder(3);
    res.append(cw1);
    if (len >= 2) {
      res.append(cw2);
    }
    if (len >= 3) {
      res.append(cw3);
    }
    return res.toString();
  }

}
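EDIFACT packs four 6-bit values into three 8-bit codewords, which is easiest to see on a full group of four characters. The sketch below repeats the arithmetic of encodeChar and encodeToCodewords for that case only. `EdifactPackSketch`, `sixBit` and `pack` are illustrative names, and partial groups and the unlatch value 31 are deliberately left out.

```java
final class EdifactPackSketch {

  // Maps an EDIFACT-encodable character (ASCII 32..94) to its 6-bit value.
  static int sixBit(char c) {
    if (c >= ' ' && c <= '?') {
      return c;        // 32..63 keep their value
    }
    if (c >= '@' && c <= '^') {
      return c - 64;   // 64..94 map to 0..30
    }
    throw new IllegalArgumentException("not an EDIFACT character: " + (int) c);
  }

  // Packs exactly four characters (4 x 6 bits) into three codewords (3 x 8 bits).
  static int[] pack(String four) {
    if (four.length() != 4) {
      throw new IllegalArgumentException("sketch packs full groups of four only");
    }
    int v = (sixBit(four.charAt(0)) << 18)
        | (sixBit(four.charAt(1)) << 12)
        | (sixBit(four.charAt(2)) << 6)
        | sixBit(four.charAt(3));
    return new int[] {(v >> 16) & 255, (v >> 8) & 255, v & 255};
  }

  public static void main(String[] args) {
    int[] cw = pack("DATA");
    System.out.println(cw[0] + " " + cw[1] + " " + cw[2]); // prints 16 21 1
  }
}
```

For "DATA" the 6-bit values are 4, 1, 20 and 1, i.e. the bit string 000100 000001 010100 000001, which regroups into the bytes 00010000, 00010101 and 00000001.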

1.6 Base256Encoder

Base256Encoder encodes the Data Matrix message using Base 256 encodation.

final class Base256Encoder implements Encoder {
  @Override
  public void encode(EncoderContext context) {
    StringBuilder buffer = new StringBuilder();
    buffer.append('\0'); //Initialize length field
    while (context.hasMoreCharacters()) {
      char c = context.getCurrentChar();
      buffer.append(c);

      context.pos++;

      int newMode = HighLevelEncoder.lookAheadTest(context.getMessage(), context.pos, getEncodingMode());
      // ... hand-back to ASCII omitted here (see section 1.1)
    }
    int dataCount = buffer.length() - 1;
    int lengthFieldSize = 1;
    int currentSize = context.getCodewordCount() + dataCount + lengthFieldSize;
    context.updateSymbolInfo(currentSize);
    boolean mustPad = (context.getSymbolInfo().getDataCapacity() - currentSize) > 0;
    if (context.hasMoreCharacters() || mustPad) {
      if (dataCount <= 249) {
        buffer.setCharAt(0, (char) dataCount);
      } else if (dataCount <= 1555) {
        buffer.setCharAt(0, (char) ((dataCount / 250) + 249));
        buffer.insert(1, (char) (dataCount % 250));
      } else {
        throw new IllegalStateException(
            "Message length not in valid ranges: " + dataCount);
      }
    }
    for (int i = 0, c = buffer.length(); i < c; i++) {
      context.writeCodeword(randomize255State(
          buffer.charAt(i), context.getCodewordCount() + 1));
    }
  }

  private static char randomize255State(char ch, int codewordPosition) {
    int pseudoRandom = ((149 * codewordPosition) % 255) + 1;
    int tempVariable = ch + pseudoRandom;
    if (tempVariable <= 255) {
      return (char) tempVariable;
    } else {
      return (char) (tempVariable - 256);
    }
  }

}
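Two details of Base 256 are worth trying with concrete numbers: the one- or two-codeword length field and the 255-state randomization applied to every codeword. The sketch below restates both on plain ints. `Base256Sketch`, `lengthField` and `unrandomize255State` are names introduced here for illustration, and the inverse function is an addition that does not appear in the listing above.

```java
final class Base256Sketch {

  // Length field for dataCount data bytes:
  // one codeword up to 249 bytes, two codewords up to 1555 bytes.
  static int[] lengthField(int dataCount) {
    if (dataCount < 1 || dataCount > 1555) {
      throw new IllegalArgumentException("Message length not in valid ranges: " + dataCount);
    }
    if (dataCount <= 249) {
      return new int[] {dataCount};
    }
    return new int[] {dataCount / 250 + 249, dataCount % 250};
  }

  // 255-state randomization of one value at a 1-based codeword position.
  static int randomize255State(int ch, int codewordPosition) {
    int pseudoRandom = ((149 * codewordPosition) % 255) + 1;
    int temp = ch + pseudoRandom;
    return temp <= 255 ? temp : temp - 256;
  }

  // Inverse of randomize255State, as a decoder would apply it.
  static int unrandomize255State(int codeword, int codewordPosition) {
    int pseudoRandom = ((149 * codewordPosition) % 255) + 1;
    int temp = codeword - pseudoRandom;
    return temp >= 0 ? temp : temp + 256;
  }

  public static void main(String[] args) {
    int[] field = lengthField(1555);
    System.out.println(field[0] + " " + field[1]);   // prints 255 55
    System.out.println(randomize255State(200, 1));   // prints 94
    System.out.println(unrandomize255State(94, 1));  // prints 200
  }
}
```

The randomization depends only on the codeword position, so the same byte value produces different codewords at different positions, and a decoder can undo it without any extra information.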