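I recently ran into a coding problem: given read/write utility methods that only accept a ByteArray, how do you save and load a list of Long values? It looks simple, but I took a few detours solving it, which turned into a good excuse to review the relevant fundamentals.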

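Bitwise operations

When in doubt, ask an AI. A problem with such a simple solution is an obvious candidate for AI-written code, so I asked and got this first version: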

fun listToByteArray(longList: List<Long>): ByteArray {
    val byteArray = ByteArray(longList.size * 8)
    for (i in longList.indices) {
        val longValue = longList[i]
        byteArray[i * 8] = (longValue shr 56).toByte()
        byteArray[i * 8 + 1] = (longValue shr 48).toByte()
        byteArray[i * 8 + 2] = (longValue shr 40).toByte()
        byteArray[i * 8 + 3] = (longValue shr 32).toByte()
        byteArray[i * 8 + 4] = (longValue shr 24).toByte()
        byteArray[i * 8 + 5] = (longValue shr 16).toByte()
        byteArray[i * 8 + 6] = (longValue shr 8).toByte()
        byteArray[i * 8 + 7] = longValue.toByte()
    }
    return byteArray
}

fun byteArrayToList(byteArray: ByteArray): List<Long> {
    val longList = mutableListOf<Long>()
    for (i in byteArray.indices step 8) {
        val longValue = (byteArray[i].toLong() shl 56) or
                        (byteArray[i + 1].toLong() shl 48) or
                        (byteArray[i + 2].toLong() shl 40) or
                        (byteArray[i + 3].toLong() shl 32) or
                        (byteArray[i + 4].toLong() shl 24) or
                        (byteArray[i + 5].toLong() shl 16) or
                        (byteArray[i + 6].toLong() shl 8) or
                        (byteArray[i + 7].toLong())
        longList.add(longValue)
    }
    return longList
}

A Kotlin Long occupies 8 bytes, so the code above is simple and matches what we would intuitively write.

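What are bitwise operations, and why use them here?

Bitwise operations work directly on the individual bits of a binary number. We need to pull each byte (8 bits) out of a Long and store the bytes separately, which is exactly what bitwise operations are good at.

Without bitwise operations, we could use division and modulo to extract and store one byte (two hexadecimal digits) at a time, but that is more cumbersome.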
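
To make the comparison concrete, here is a minimal sketch of both approaches. The helper names `byteAt` and `byteAtByDivision` are made up for this illustration, and the division version is only valid for non-negative values, which is part of why it is the clumsier option.

```kotlin
// Hypothetical helper: returns byte number `index` of `value` as 0..255,
// where index 0 is the least significant byte.
fun byteAt(value: Long, index: Int): Int =
    ((value ushr (index * 8)) and 0xFFL).toInt()

// Same idea with division and modulo. Assumes value >= 0, because
// Kotlin's / and % truncate toward zero and would give negative remainders.
fun byteAtByDivision(value: Long, index: Int): Int {
    require(value >= 0) { "division-based extraction only handles non-negative values" }
    return ((value / (1L shl (index * 8))) % 256).toInt()
}
```

For example, `byteAt(0x1234567890ABCDEFL, 0)` yields 0xEF and `byteAt(0x1234567890ABCDEFL, 7)` yields 0x12.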

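Which bitwise operators does Kotlin support?

Operator	Meaning	Example
and	bitwise AND	0b0101 and 0b0011 == 0b0001
or	bitwise OR	0b0101 or 0b0011 == 0b0111
xor	bitwise XOR	0b0101 xor 0b0011 == 0b0110
inv	bitwise NOT	0b0101.inv() == -6 (all 32 bits of the Int flip; the low 4 bits are 0b1010)
shl	left shift	0b0101 shl 1 == 0b1010
shr	arithmetic right shift	0b0101 shr 1 == 0b0010, -8 shr 1 == -4
ushr	logical right shift	0b0101 ushr 1 == 0b0010, -8 ushr 28 == 0b1111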

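What is the difference between shr and ushr?

shr is an arithmetic right shift that preserves the sign bit: the vacated high bits are filled with 0 when the sign bit is 0 and with 1 when the sign bit is 1. ushr is a logical right shift: the vacated high bits are always filled with 0, so the two only differ for negative values.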
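
A small sketch of the difference on a negative Long (the variable names are only for illustration):

```kotlin
val negative = -16L                  // bit pattern 0xFFFF_FFFF_FFFF_FFF0
val arithmetic = negative shr 2      // high bits filled with 1s: -4
val logical = negative ushr 60       // high bits filled with 0s: 15

// For a non-negative value both shifts agree.
val positiveShr = 16L shr 2          // 4
val positiveUshr = 16L ushr 2        // 4
```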

Sign extension

Is the AI's code actually correct? At a glance it looks fine, but try it with this input and you will see it is not:

fun main() {
    val originList = listOf(-1L, 0L, 1L, 2L, 4L, 8L, 16L, 32L, 64L, 128L, 256L, 65536L, 65537L)
    val converted = byteArrayToList(listToByteArray(originList))
    println("origin: $originList, converted: $converted")
}

// output:
// origin: [-1, 0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 65536, 65537], converted: [-1, 0, 1, 2, 4, 8, 16, 32, 64, -128, 256, 65536, 65537]

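Why does it break as soon as a value goes past 127, when the range of Long is obviously far larger? The reason is that Kotlin's Byte is a signed type with a range of -128 to 127, so any extracted byte whose unsigned value is above 127 is represented as the corresponding negative number. When such a negative Byte is converted to the wider Long type, "sign extension" happens: the new high bits are filled with copies of the sign bit. A positive number has sign bit 0, so it is padded with 0s and its meaning is unchanged. A negative number has sign bit 1, so all the new high bits become 1, and because negative numbers are stored in two's complement, the numeric value stays the same (the byte 0x80 becomes the Long -128). The value is preserved, but the bit pattern is not: those extra 1s are then OR-ed into the higher bytes of the result, which is why 128 is read back as -128. Masking each byte with 0xFF after toLong() clears the extended bits and avoids the problem.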
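
As a sketch of one way to repair the decoder (not necessarily what the AI would produce on a second try), each byte can be masked with 0xFF before it is combined. The function name `byteArrayToListMasked` is made up for this example, and the size check is an added assumption about how malformed input should be handled.

```kotlin
// Sign extension in isolation:
val raw: Byte = 0x80.toByte()               // bit pattern 1000_0000, value -128
val extended: Long = raw.toLong()           // -128, high 56 bits are all 1
val masked: Long = raw.toLong() and 0xFFL   // 128, high 56 bits cleared

// Hypothetical fixed decoder: big-endian, 8 bytes per Long.
fun byteArrayToListMasked(byteArray: ByteArray): List<Long> {
    require(byteArray.size % 8 == 0) { "length must be a multiple of 8" }
    val longList = ArrayList<Long>(byteArray.size / 8)
    for (i in byteArray.indices step 8) {
        var longValue = 0L
        for (j in 0 until 8) {
            // Mask to 0..255 so sign-extended bits cannot leak into higher bytes.
            longValue = (longValue shl 8) or (byteArray[i + j].toLong() and 0xFFL)
        }
        longList.add(longValue)
    }
    return longList
}
```

The encoder does not need the same treatment, because toByte() simply keeps the low 8 bits and discards the rest.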

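Byte order

Byte order (endianness) is the way the bytes of a multi-byte data type are laid out in memory. In big-endian order, the most significant byte is stored at the lowest address and the least significant byte at the highest address; little-endian is the reverse.

For example, in big-endian order 0x12345678 is laid out in memory as 12 34 56 78, while in little-endian order it is 78 56 34 12.

If a program exchanges data between two machines with different byte orders, the mismatch can cause bugs. Whenever that kind of exchange is involved, the byte order has to be agreed on up front.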
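
The two layouts can be observed with java.nio.ByteBuffer, which lets the byte order be set explicitly. This is only a sketch; the helper name `intToBytes` is made up for the example.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: serializes an Int using the requested byte order.
fun intToBytes(value: Int, order: ByteOrder): ByteArray =
    ByteBuffer.allocate(4).order(order).putInt(value).array()

// Formats bytes as two-digit hex, e.g. "12 34 56 78".
fun toHex(bytes: ByteArray): String =
    bytes.joinToString(" ") { "%02x".format(it.toInt() and 0xFF) }
```

With these, `toHex(intToBytes(0x12345678, ByteOrder.BIG_ENDIAN))` gives "12 34 56 78" and the LITTLE_ENDIAN variant gives "78 56 34 12". A newly created ByteBuffer defaults to big-endian, which is also the order the hand-written shift code earlier in this post uses.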

A simpler way

In fact, ByteBuffer already does all of this. For low-level code like this, avoid writing it yourself whenever you can.

import java.nio.ByteBuffer

fun listToByteArray(longList: List<Long>): ByteArray {
    val byteBuffer = ByteBuffer.allocate(longList.size * 8) // 8 bytes per Long
    for (longValue in longList) {
        byteBuffer.putLong(longValue) // write the Long into the buffer (big-endian by default)
    }
    return byteBuffer.array() // return the backing byte array
}

fun byteArrayToList(byteArray: ByteArray): List<Long> {
    val byteBuffer = ByteBuffer.wrap(byteArray) // wrap the byte array in a ByteBuffer
    val longList = mutableListOf<Long>()

    while (byteBuffer.remaining() >= 8) { // make sure a full Long is still available
        longList.add(byteBuffer.getLong()) // read the next Long from the buffer
    }
    
    return longList
}

AI-generated code is not necessarily concise or reliable, so you still need the judgment to evaluate it yourself before using it.

Last modified on 2024-11-10