国内知名
计算机技术平台

深入理解雪花算法

今天的主角雪花算法,它是Twitter开源的由64位整数组成分布式ID,性能较高,并且在单机上递增。 具体参考:https://github.com/twitter-archive/snowflake

SnowFlake是Twitter公司采用的一种算法,目的是在分布式系统中产生全局唯一且趋势递增的ID。

组成部分(64bit)

  1. 第一位 占用1bit,其值始终是0,没有实际作用。
  2. 时间戳 占用41bit,精确到毫秒,总共可以容纳约69年的时间。
  3. 工作机器id 占用10bit,其中高位5bit是数据中心ID,低位5bit是工作节点ID,做多可以容纳1024个节点。
  4. 序列号 占用12bit,每个节点每毫秒0开始不断累加,最多可以累加到4095,一共可以产生4096个ID。

SnowFlake算法在同一毫秒内最多可以生成多少个全局唯一ID呢:: 同一毫秒的ID数量 = 1024 X 4096 = 4194304

特性

  1. 支持自定义允许时间回拨的范围;
  2. 解决跨毫秒起始值每次为0开始的情况(避免末尾必定为偶数,而不便于取余使用问题)
  3. 解决高并发场景中获取时间戳性能问题<p>
  4. 支撑根据IP末尾数据作为workerId
  5. 时间回拨方案思考:1024个节点中分配10个点作为时间回拨序号(连续10次时间回拨的概率较小)

雪花算法的实现

SnowflakeGenerator.java

/**
 * @author Pang Yandong
 * @description 雪花算法生成器
 * @date 2022-06-18 11:52:01
 */
public class SnowflakeGenerator {
    private final Sequence sequence;

    public SnowflakeGenerator() {
        this.sequence = new Sequence();
    }

    public SnowflakeGenerator(long dataCenterId, long workerId) {
        this.sequence = new Sequence(dataCenterId, workerId);
    }

    public SnowflakeGenerator(Sequence sequence) {
        this.sequence = sequence;
    }

    public Long nextId() {
        return sequence.nextId();
    }
}

Sequence.java

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.lang.management.ManagementFactory;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.Enumeration;
import java.util.concurrent.ThreadLocalRandom;
import java.util.regex.Pattern;

/**
 * @author Pang Yandong
 * @description 雪花算法
 * @date 2022-06-18 11:52:22
 */
public class Sequence {
    private static final Logger log = LoggerFactory.getLogger(Sequence.class);

    // 时间起始标记点,作为基准,一般取系统的最近时间(一旦确定不可变动),这里取2022-06-06 00:00:00
    private final long twepoch = 1654444800000L;

    // 同一时间的序列
    private long sequence = 0L;;
    // 同一时间的序列所占的位数,12个bit,111111111111 = 4095,同一毫秒最多生成4096个
    private final long sequenceBits = 12L;
    // 用于序号的与运算,保证序号最大值在0-4095之间
    private final long sequenceMask = ~(-1L << sequenceBits);

    // 机器ID
    private long workerId;
    // 机器ID所占的位数,5个bit,最大为:11111(2进制),31(10进制)
    private final long workerIdBits = 5L;
    // 5 bit最多只能有31个数字,就是说机器id最多只能是32以内
    private final long maxWorkerId = ~(-1L << workerIdBits);
    // workerId的偏移量
    private final long workerIdShift = sequenceBits;

    // 数据中心(机房)id
    private long datacenterId;
    // 数据中心号的ID所占的位数,5个bit,最大为:11111(2进制),31(10进制)
    private final long datacenterIdBits = 5L;
    // 5 bit最多只能有31个数字,数据中心id最多只能是32以内
    private final long maxDatacenterId = ~(-1L << datacenterIdBits);
    // datacenterId的偏移量
    private final long datacenterIdShift = sequenceBits + workerIdBits;

    // 时间戳(上次生产 ID 时间戳)
    private long lastTimestamp = -1L;
    // timestampLeft的偏移量
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    private static volatile InetAddress LOCAL_ADDRESS = null;
    private static final Pattern IP_PATTERN = Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3,5}$");

    /**
     * @description 无参构造器
     * @author Pang Yandong
     * @date 2022-06-18 10:40:43
     */
    public Sequence() {
        this.datacenterId = getDatacenterId(maxDatacenterId);
        this.workerId = getMaxWorkerId(datacenterId, maxWorkerId);
    }

    /**
     * @param datacenterId 数据中心ID
     * @param workerId 机器ID
     * @return
     * @description 有参构造器
     * @author Pang Yandong
     * @date 2022-06-18 10:41:03
     */
    public Sequence(long datacenterId, long workerId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    /**
     * @param datacenterId
     * @param maxWorkerId
     * @return long
     * @description 基于 MAC + PID 的 hashcode 获取16个低位
     * @author Pang Yandong
     * @date 2022-06-18 11:21:57
     */
    protected static long getMaxWorkerId(long datacenterId, long maxWorkerId) {
        StringBuilder mpid = new StringBuilder();
        mpid.append(datacenterId);
        String name = ManagementFactory.getRuntimeMXBean().getName();
        if (name != null && name.length() > 0) {
            // 获取jvmPid
            mpid.append(name.split("@")[0]);
        }
        // MAC + PID 的 hashcode 获取16个低位
        return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
    }

    /**
     * @param maxDatacenterId
     * @return long
     * @description 基于网卡MAC地址计算余数作为数据中心id
     * @author Pang Yandong
     * @date 2022-06-18 11:22:08
     */
    protected static long getDatacenterId(long maxDatacenterId) {
        long id = 0L;
        try {
            NetworkInterface network = NetworkInterface.getByInetAddress(getLocalAddress());
            if (network == null) {
                id = 1L;
            } else {
                byte[] mac = network.getHardwareAddress();
                if (null != mac) {
                    id = ((0x000000FF & (long) mac[mac.length - 2]) | (0x0000FF00 & (((long) mac[mac.length - 1]) << 8))) >> 6;
                    id = id % (maxDatacenterId + 1);
                }
            }
        } catch (Exception e) {
            System.out.println("获取数据中心ID异常:" + e.getMessage());
        }
        return id;
    }

    /**
     * @param
     * @return long
     * @description 获取下一个随机的ID
     * @author Pang Yandong
     * @date 2022-06-18 10:50:24
     */
    public synchronized long nextId() {
        // 获取当前时间戳,单位毫秒
        long timestamp = timeGen();
        // 发生时钟回拨
        if (timestamp < lastTimestamp) {
            long offset = lastTimestamp - timestamp;
            if (offset <= 5) {
                try {
                    // 休眠双倍差值后重新获取,再次校验
                    wait(offset << 1);
                    timestamp = timeGen();
                    if (timestamp < lastTimestamp) {
                        throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds", offset));
                    }
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            } else {
                throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds", offset));
            }
        }

        // 如果是同一时间生成的,则进行毫秒内序列
        if (lastTimestamp == timestamp) {
            // 相同毫秒内,序列号自增
            sequence = (sequence + 1) & sequenceMask;
            // 毫秒内序列溢出
            if (sequence == 0) {
                // 阻塞到下一个毫秒,获得新的时间戳
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // 不同毫秒内,序列号置为 1 - 3 随机数
            sequence = ThreadLocalRandom.current().nextLong(1, 3);
        }

        // 记录上一次的时间戳
        lastTimestamp = timestamp;

        // 偏移计算(时间戳部分 | 数据中心部分 | 机器标识部分 | 序列号部分)
        return ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
    }

    /**
     * @param lastTimestamp 上次生成ID的时间截
     * @return long 当前时间戳
     * @description 阻塞到下一个毫秒,直到获得新的时间戳
     * @author Pang Yandong
     * @date 2022-06-18 10:55:53
     */
    private long tilNextMillis(long lastTimestamp) {
        // 获取最新时间戳
        long timestamp = timeGen();
        // 如果发现最新的时间戳小于或者等于序列号已经超4095的那个时间戳
        while (timestamp <= lastTimestamp) {
            // 不符合则继续
            timestamp = timeGen();
        }
        return timestamp;
    }

    /**
     * @return long 当前时间(毫秒)
     * @description 返回以毫秒为单位的当前时间
     * @author Pang Yandong
     * @date 2022-06-18 10:58:51
     */
    private long timeGen() {
        return SystemClock.now();
    }

    /**
     * @return java.net.InetAddress
     * @description 从本地网卡找到第一个有效的IP
     * @author Pang Yandong
     * @date 2022-06-18 11:49:35
     */
    public static InetAddress getLocalAddress() {
        if (LOCAL_ADDRESS != null) {
            return LOCAL_ADDRESS;
        }

        LOCAL_ADDRESS = getLocalAddress0();
        return LOCAL_ADDRESS;
    }

    /**
     * @return java.net.InetAddress
     * @description 获取IP地址
     * @author Pang Yandong
     * @date 2022-06-18 11:50:20
     */
    private static InetAddress getLocalAddress0() {
        InetAddress localAddress = null;
        try {
            localAddress = InetAddress.getLocalHost();
            if (isValidAddress(localAddress)) {
                return localAddress;
            }
        } catch (Throwable e) {
            log.warn("Failed to retrieving ip address, " + e.getMessage(), e);
        }

        try {
            Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
            if (interfaces != null) {
                while (interfaces.hasMoreElements()) {
                    try {
                        NetworkInterface network = interfaces.nextElement();
                        Enumeration<InetAddress> addresses = network.getInetAddresses();
                        while (addresses.hasMoreElements()) {
                            try {
                                InetAddress address = addresses.nextElement();
                                if (isValidAddress(address)) {
                                    return address;
                                }
                            } catch (Throwable e) {
                                log.warn("Failed to retrieving ip address, " + e.getMessage(), e);
                            }
                        }
                    } catch (Throwable e) {
                        log.warn("Failed to retrieving ip address, " + e.getMessage(), e);
                    }
                }
            }
        } catch (Throwable e) {
            log.warn("Failed to retrieving ip address, " + e.getMessage(), e);
        }

        log.error("Could not get local host ip address, will use 127.0.0.1 instead.");
        return localAddress;
    }

    /**
     * @param address
     * @return boolean
     * @description 校验地址是否有效
     * @author Pang Yandong
     * @date 2022-06-18 11:49:15
     */
    private static boolean isValidAddress(InetAddress address) {
        if (address == null || address.isLoopbackAddress()) {
            return false;
        }

        String name = address.getHostAddress();
        return (name != null && !"0.0.0.0".equals(name) && !"127.0.0.1".equals(name) && IP_PATTERN.matcher(name).matches());
    }
}

SystemClock.java

import java.sql.Timestamp;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/**
 * @author Pang Yandong
 * @description 高并发场景下System.currentTimeMillis()的性能问题的优化
 * @date 2022-06-18 11:19:25
 */
public class SystemClock {

    private final long period;
    private final AtomicLong now;

    private SystemClock(long period) {
        this.period = period;
        this.now = new AtomicLong(System.currentTimeMillis());
        scheduleClockUpdating();
    }

    private static SystemClock instance() {
        return InstanceHolder.INSTANCE;
    }

    public static long now() {
        return instance().currentTimeMillis();
    }

    public static String nowDate() {
        return new Timestamp(instance().currentTimeMillis()).toString();
    }

    private void scheduleClockUpdating() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "System Clock");
            thread.setDaemon(true);
            return thread;
        });
        scheduler.scheduleAtFixedRate(() -> now.set(System.currentTimeMillis()), period, period, TimeUnit.MILLISECONDS);
    }

    private long currentTimeMillis() {
        return now.get();
    }

    private static class InstanceHolder {
        public static final SystemClock INSTANCE = new SystemClock(1);
    }
}

使用方法

将如上3个文件复制到项目中,如果分布式机器较少,建议使用雪花算法生成器的有参构造函数,指定数据中心ID和机器ID。如果机器较多见识使用默认的构造函数。

其他

文章源码参考MybatisPlus以及https://gitee.com/yu120/sequence

附:

Leaf是美团开源的分布式ID生成器,能保证全局唯一,趋势递增,但需要依赖关系数据库、Zookeeper等中间件。 具体参考:https://tech.meituan.com/MT_Leaf.html

赞(0) 打赏
未经允许不得转载:东云网 » 深入理解雪花算法
分享到: 更多 (0)

评论 抢沙发

评论前必须登录!

 

东云IT,与您偕行!

免责声明联系我们

觉得文章有用就打赏一下文章作者吧,么么哒~

非常感谢你的打赏,我们将继续给力更多优质内容,让我们一起创建更加美好的网络世界!

支付宝扫一扫打赏

微信扫一扫打赏