ACP: Bigint helper methods #228
What is the motivation for putting the carry as the second field in the tuple? My intuition is that it should be in the first position, as that's the most significant position in numbers. Also, it seems odd to return a bool when most use cases are just going to convert it back to a number anyway.
When working with a bigint whose limbs are stored in little-endian order (i.e. arranged least to most significant), having the carry in the second position of the tuple feels more natural and translates better to having similar carrying arithmetic on the bigints themselves.
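For illustration, a minimal sketch of that chaining over a little-endian limb slice; the function name and the use of u128 as a stand-in for the proposed carrying_add are purely for exposition:

```rust
/// Adds `rhs` into `acc` in place (both little-endian limb slices of equal
/// length), returning the final carry-out. The (value, carry) tuple order
/// mirrors the limb order: least significant part first.
fn add_assign_limbs(acc: &mut [u64], rhs: &[u64]) -> bool {
    let mut carry = false;
    for (a, &b) in acc.iter_mut().zip(rhs) {
        // stand-in for the proposed `a.carrying_add(b, carry)`
        let wide = *a as u128 + b as u128 + carry as u128;
        *a = wide as u64;          // low half becomes the new limb
        carry = (wide >> 64) != 0; // high half is the carry into the next limb
    }
    carry
}
```

The carry-out of each limb feeds straight into the carry-in of the next, more significant limb, so the caller never needs to convert the bool back into a number.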
Other operations we'll likely want that I don't think were mentioned are unsigned-signed widening/carrying multiply.
A couple, actually. First, this mirrors the existing APIs for methods like overflowing_add.
This is also incorrect for a few reasons. Firstly, the carry bit is passed among operations, so you should never have to do that conversion yourself.
Also adding this one since it got posted mid-writing this:
I'm not 100% sure of the motivation for this, unless we actually change the representation of the least-significant part of a signed multiplication to be unsigned, which is still undecided. I personally am a fan of keeping all of the parts of a signed operation signed, since values that don't overflow are meant to be interpreted as signed. Actually, in a varint format I've been working on (code not linkable atm due to unrelated issues), I use signed integers throughout all values inside a two's complement slice, since it just makes more sense to use a slice instead of tuples.
the motivation is for when multiplying twos complement bigints, the most significant limb is signed and the rest are unsigned, therefore we need unsigned-signed multiplication. carrying signed-unsigned mul would be like so:

```rust
/// 8-bit version for ease of exposition
pub fn mul_carrying_unsigned_signed(a: u8, b: i8, carry: i8) -> (u8, i8) {
    let prod = a as i16 * b as i16; // can't overflow -- in -0x7F80..=0x7E81
    let sum = prod + carry as i16; // can't overflow -- in -0x8000..=0x7F00
    // equivalent to carry = sum.div_floor(0x100)
    // equivalent to result = sum.mod_floor(0x100)
    let carry = sum >> 8; // in -0x80..=0x7F
    let result = sum & 0xFF; // in 0x0..=0xFF
    // result is unsigned: twos complement integers are just unsigned integers
    // with the MSB's value reassigned as -2^(width-1) rather than 2^(width-1);
    // all lower bits are still positive and *not* signed.
    (result as u8, carry as i8)
}

// example usage (again 8-bit for exposition):

/// twos complement bigint
pub struct BigInt {
    pub limbs: Vec<u8>, // little-endian
    pub msb_limb: i8, // only the msb is a sign bit so only msb_limb is signed
}

pub fn mul_biguint_by_int(biguint_limbs: &[u8], int: i8, carry_in: i8) -> BigInt {
    let mut carry = carry_in;
    let mut limbs = vec![];
    for limb in biguint_limbs {
        let (result, carry_out) = mul_carrying_unsigned_signed(*limb, int, carry);
        limbs.push(result);
        carry = carry_out;
    }
    BigInt {
        limbs,
        msb_limb: carry,
    }
}

#[test]
fn test_it() {
    let a = 0x1234_5678_9ABC_DEF0_i64.to_le_bytes();
    let b = -0x10i8;
    let carry_in = -0x42i8;
    let result = mul_biguint_by_int(&a, b, carry_in);
    let result_u64 = u64::from_le_bytes(result.limbs.try_into().unwrap());
    let expected = -0x1_2345_6789_ABCD_EF42_i128;
    assert_eq!(result.msb_limb, (expected >> 64) as i8);
    assert_eq!(result_u64, expected as u64);
}
```
note that RISC-V has an unsigned-signed multiply instruction (mulhsu)
I had no idea that was a common implementation, although I guess that I haven't really had too much exposure to the wider variety of implementations. My assumption was that everything either had a separate sign bit or was two's complement throughout, not that there would be these sorts of hybrid implementations. I still question whether we want to make the signed-signed multiplication return (signed, unsigned) instead of (signed, signed), but these sorts of implementations definitely call out the need to offer signed-unsigned multiplication regardless.
an example of why signed outputs should be represented as a signed high half and unsigned low half:

```rust
#[test]
fn test_fn() {
    let wide: i64 = (-1 << 36) + (1 << 31);
    let hi_s = (wide >> 32) as i32;
    // correct since `wide`'s sign bit is not in the low half
    let lo_u = wide as u32;
    let lo_s = wide as i32; // incorrect
    let hi_value = i64::from(hi_s) << 32;
    // from is value-preserving
    let lo_u_value = i64::from(lo_u);
    let lo_s_value = i64::from(lo_s);
    // lo_u correctly represents the value of the low half
    assert_eq!(hi_value + lo_u_value, wide);
    // lo_s *doesn't* correctly represent the value of the low half
    assert_ne!(hi_value + lo_s_value, wide);
}
```
unsigned bigint * signed scalar can be used as part of a 2's complement bigint * bigint multiply
Closing this, since there is already a tracking issue for discussing the (unstable) feature: rust-lang/rust#85532. If there are any thoughts on the thread here that are not reflected on the tracking issue, please make sure they're discussed there. Thank you.
Proposal
Note: this is a retroactive ACP for a feature that already exists in unstable and is tracked at rust-lang/rust#85532. It's not 100% complete, but exists to start discussing this feature through the new ACP process rather than in the tracking issue.
For the sake of brevity, the term "big integer" will be shortened to "bigint" and refers to aggregate integer types, usually those represented by slices of primitive integer types stored in some heap vector. It includes the standard "big integer" types, which represent unbounded-size integers, but also the aggregate integer types needed to implement larger integer types, like those needed for many cryptographic operations.
Problem statement
The standard library does not offer its own big-integer types, which is left as a task for the larger ecosystem. However, the actual primitive operations required to implement big-integer types are supported by hardware instructions on many architectures, and can be added onto integers without committing to any specific API for the big-integers themselves.
These operations include things like:
- addition and subtraction that take and return a carry/borrow bit (carrying_add, borrowing_sub)
- widening multiplication that returns both halves of the double-width product (widening_mul)
- multiplication with an extra carry operand (carrying_mul)
Motivating examples or use cases
Although it's understandable that the standard library does not offer a big-integer type due to the complexity and variety of the potential APIs, the underlying primitive operations needed for them are a good fit for the standard library, sitting alongside existing methods like overflowing_add, etc.
The main benefit of this is so that crates in the ecosystem like num-bigint and more can focus on offering a specific API, rather than also having to juggle the optimisations needed for an efficient implementation. While many of these crates will inevitably still resort to inline assembly, they'll be able to fall back to efficient operations in libstd without having to worry about sacrificing too much performance on other platforms.
Solution sketch
This is the existing implementation (the full listing is in the tracking issue):
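A rough sketch of its shape, with the helpers written here as stand-alone functions over u64 and implemented via u128 widening so the example is self-contained; the actual unstable API defines them as inherent methods on every integer width, and the exact signatures may still change:

```rust
/// `a + b + carry`, returning the sum and the carry-out.
pub fn carrying_add(a: u64, b: u64, carry: bool) -> (u64, bool) {
    let (sum, c1) = a.overflowing_add(b);
    let (sum, c2) = sum.overflowing_add(carry as u64);
    (sum, c1 | c2) // at most one of the two steps can carry
}

/// `a - b - borrow`, returning the difference and the borrow-out.
pub fn borrowing_sub(a: u64, b: u64, borrow: bool) -> (u64, bool) {
    let (diff, b1) = a.overflowing_sub(b);
    let (diff, b2) = diff.overflowing_sub(borrow as u64);
    (diff, b1 | b2)
}

/// The full 128-bit product of `a * b`, returned as (low, high) halves.
pub fn widening_mul(a: u64, b: u64) -> (u64, u64) {
    let wide = a as u128 * b as u128;
    (wide as u64, (wide >> 64) as u64)
}

/// `a * b + carry`, returned as (low, high) halves; this cannot overflow.
pub fn carrying_mul(a: u64, b: u64, carry: u64) -> (u64, u64) {
    let wide = a as u128 * b as u128 + carry as u128;
    (wide as u64, (wide >> 64) as u64)
}
```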
Note that this will likely be augmented to include more operations (like shifts, division, and remainder), and potentially resolve some open questions (the term "borrowing" instead of "carrying" for everything, the return type of widening operations, etc.).
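As one sketch of the kind of shift operation that could be added, here is a hypothetical carrying left shift over little-endian limbs; the carrying_shl helper and its signature are illustrative only and not part of the current unstable feature:

```rust
/// Hypothetical helper: shifts `x` left by `shift` bits (0 < shift < 64),
/// ORs in the bits carried out of the previous limb, and returns the bits
/// shifted out of the top as the new carry.
fn carrying_shl(x: u64, shift: u32, carry_in: u64) -> (u64, u64) {
    debug_assert!(shift > 0 && shift < 64);
    ((x << shift) | carry_in, x >> (64 - shift))
}

/// Shifts a little-endian limb slice left in place, returning the bits
/// shifted out of the most significant limb.
fn shl_limbs_in_place(limbs: &mut [u64], shift: u32) -> u64 {
    let mut carry = 0;
    for limb in limbs.iter_mut() {
        let (shifted, carry_out) = carrying_shl(*limb, shift, carry);
        *limb = shifted;
        carry = carry_out;
    }
    carry
}
```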
Alternatives
The primary alternative to offering this API is to not offer it. There are already some architecture-specific intrinsics available like core::arch::x86_64::_addcarry_u64, but this only avoids inline assembly rather than avoiding architecture-specific optimisations altogether.
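For example, a crate reaching for that intrinsic still has to cfg-gate it and keep a portable fallback around; a minimal sketch (the addcarry wrapper name is just for illustration):

```rust
/// On x86-64, defer to the add-with-carry intrinsic.
#[cfg(target_arch = "x86_64")]
fn addcarry(carry: bool, a: u64, b: u64) -> (u64, bool) {
    let mut out = 0u64;
    let carry_out = unsafe { core::arch::x86_64::_addcarry_u64(carry as u8, a, b, &mut out) };
    (out, carry_out != 0)
}

/// Everywhere else, a portable fallback is still required.
#[cfg(not(target_arch = "x86_64"))]
fn addcarry(carry: bool, a: u64, b: u64) -> (u64, bool) {
    let wide = a as u128 + b as u128 + carry as u128;
    (wide as u64, (wide >> 64) != 0)
}
```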
Links and related work
You can find most of this in the tracking issue: rust-lang/rust#85532.
What happens now?
This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
Second, if there's a concrete solution: