Merge dev into master (0.01.001)

Charlie Malmqvist 2024-07-20 16:11:20 +02:00
commit dd2c9e3a09
29 changed files with 1608 additions and 3205 deletions


@ -1,30 +1,79 @@
ooga booga
## TOC
- [Getting started](#getting-started)
- [What is ooga booga?](#what-is-ooga-booga)
- [A new C Standard](#a-new-c-standard)
- [SIMPLICITY IS KING](#simplicity-is-king)
- [The "Build System"](#the-build-system)
- [Course: From Scratch to Steam](#course-from-scratch-to-steam)
- [Quickstart](#quickstart)
- [The "Build System"](#the-build-system)
- [Examples & Documentation](#examples--documentation)
- [Known bugs](#known-bugs)
- [Licensing](#licensing)
## Getting started
If you'd like to learn how to use the engine to build a game, there's a completely free course in the [Skool community](https://www.skool.com/game-dev).
You can find all tutorials and resources for getting started within the community.
## What is ooga booga?
Ooga booga, often referred to as a *game engine* for simplicity, is more so designed to be a new C Standard, i.e. a new way to develop software from scratch in C. Other than `<math.h>` we don't include a single C std header, but are instead writing a better standard library heavily optimized for developing games. Except for some image & audio file decoding, Ooga booga does not rely on any other third party code.
### A new C Standard
Let's face it. The C standard is terrible. Don't even get me started on `string.h`. To be fair, any mainstream language standard is terrible.
So what if we could strip out the nonsense standard of C and slap on something that's specifically made for video games, prioritizing speed and *simplicity*?
That's exactly what oogabooga sets out to do.
### SIMPLICITY IS KING
Ooga booga is designed to keep things simple, and let you solve video game problems the simplest way possible.
What we mean by simple, is twofold:
1. <b>Simple to use</b><br>
Performing SIMPLE and TRIVIAL tasks should be ... SIMPLE. If you want to draw a rectangle, there should be a single procedure to draw a rectangle. If you want to play an audio clip, there should be a single procedure to play an audio clip. Etc. This is something OS & Graphics APIs tend to be fascinatingly terrible at even for the most trivial of tasks, and that is a big chunk of what we set out to solve.
2. <b>Simple to understand</b><br>
When you need to do something more complicated, you need to understand the library you're working with. For some reason, it seems like it's a standard for libraries today to obscure the implementation details as much as possible spread out in layers and layers of procedure calls and abstractions. This is terrible.
In Oogabooga, there is none of that. We WANT you to delve into our implementations and see exactly what we do. We do not hide ANYTHING from you. We do not impose RESTRICTIONS on how you solve problems. If you need to know what a procedure does, you search for the symbol and look at the implementation code. That's it.
### The "Build System"
Our build system is a build.c and a build.bat which invokes the clang compiler on build.c. That's it. And we highly discourage anyone from introducing unnecessary complexity like a third party build system (cmake, premake) or to use header files at all whatsoever.
This might sound like we are breaking some law, but we're not. We're using a compiler to compile a file which includes all the other files, it doesn't get simpler. We are NOT using third party software to run the same compiler to compile the same files over and over again and write it all to disk to then try and link it together. That's what we call silly business (and unreasonably slow compile times, without any real benefit).
Oogabooga is made to be used in unity builds (a single translation unit; nothing to do with the Unity engine). The idea is that you only include oogabooga.c somewhere in your project, specify the entry (see build.c), and now it's an Oogabooga project. Oogabooga is meant to replace the C standard, so it is not tested with projects that include standard C headers; doing so will probably cause issues.
## Course: From Scratch to Steam
This project was started to be used in a course detailing the full ride from starting out making a game to publishing it to Steam. If you're keen on going all-in on getting a small game published to Steam within 2-3 months, then check it out for free in our [Skool Community](https://www.skool.com/game-dev).
## Quickstart
Currently, we only support Windows x64 systems.
1. Make sure Windows SDK is installed
2. Install clang, add to path
2. Clone repo to <project_dir>
3. Make a file my_file.c in <project_dir>
```
int entry(int argc, char **argv) {
    print("Ooga, booga!\n");
    window.title = STR("Minimal Game Example");
    window.scaled_width = 1280; // We need to set the scaled size if we want to handle system scaling (DPI)
    window.scaled_height = 720;
    window.x = 200;
    window.y = 90;
    window.clear_color = hex_to_rgba(0x6495EDff);
    while (!window.should_close) {
        reset_temporary_storage();
        os_update();
        gfx_update();
    }
    return 0;
}
```
4. in build.c add this line to the bottom
@ -35,11 +84,6 @@ int entry(int argc, char **argv) {
6. Run build/cgame.exe
7. profit
## The "Build System"
Our build system is a build.c and a build.bat which invokes the clang compiler on build.c. That's it. And we highly discourage anyone from introducing unnecessary complexity like a third party build system (cmake, premake) or to use header files at all whatsoever.
This might sound like we are breaking some law, but we're not. We're using a compiler to compile a file which includes all the other files, it doesn't get simpler. We are NOT using third party software to run the same compiler to compile the same files over and over again and write it all to disk to then try and link it together. That's what we call silly business (and unreasonably slow compile times, without any real benefit).
## Examples & Documentation
Documentation will come in the form of a lot of examples because that's the best way to learn and understand how everything works.
@ -52,6 +96,7 @@ Other than examples, a great way to learn is to delve into the code of whatever
## Known bugs
- Window positioning & sizing is fucky wucky
- Converting 24-bit audio files doesn't really work
## Licensing
By default, the repository has an educational license that makes the engine free to use for personal projects.

TODO

@ -1,33 +1,45 @@
- Audio - Audio
- Player volume control - Avoid playing identical audio at the same time
- Allow audio programming
- Inject mixer proc per player
- Inject a mixer proc before and after filling output buffer
- Better spacialization
- Fix vorbis >FILE< streaming (#StbVorbisFileStream)
- Handle audio sources which had their format defaulted to update when default device changes
For streamed audio sources this should be easy enough, because the conversion happens from raw format to source format as we stream it.
For loaded sources though, we would need to convert the source->pcm_frames.
- Optimize - Optimize
- Spam simd - Spam simd
- Concurrent jobs for players? - Concurrent jobs for players?
- Mega buffer for contiguous intermediate buffers - Mega buffer for contiguous intermediate buffers
We definitely also want a limit to how much memory we want allocated to intermediate buffers. We definitely also want a limit to how much memory we want allocated to intermediate buffers.
- Avoid playing identical audio at the same time
- Custom audio mixing
- Fix vorbis >FILE< streaming (#StbVorbisFileStream)
- Handle audio sources which had their format defaulted to update when default device changes
For streamed audio sources this should be easy enough, because the conversion happens from raw format to source format as we stream it.
For loaded sources though, we would need to convert the source->pcm_frames.
- Bugs: - Bugs:
- Small fadeout on pause is slightly noisy - Small fadeout on pause is slightly noisy
- Setting time stamp/progression causes noise (need fade transition like on pause/play) - Setting time stamp/progression causes noise (need fade transition like on pause/play)
- End of clip also causes noise if audio clip does not end smoothly - End of clip also causes noise if audio clip does not end smoothly
- Setting audio source to a format which differs from audio output format in both channels and bit_width at the same time will produce pure loud noise. - Setting audio source to a format which differs from audio output format in both channels and bit_width at the same time will produce pure loud noise.
- 24-Bit audio conversion doesn't really work - 24-Bit audio conversion doesn't really work
- Converting 24-bit audio files doesn't really work
- Juice
- draw_line
- vx_length
- Occlude quads out of view
- General bugs & issues - General bugs & issues
- Release freeze in run_tests - Release freeze in run_tests
- Window width&height is zero when minimized (and we make a 0x0 swap chain)
- Window positioning & sizing is fucky wucky
- Memory error messages are misleading when no VERY_DEBUG
- Renderer
- API to pass constant values to shader (codegen #define's)
- Fonts
- Atlases are way too big, render atlases with size depending on font_height (say, 128 codepoints per atlas)
- OS
Window::bool is_minimized
don't set window.width & window.height to 0
- Needs testing: - Needs testing:
- Audio format channel conversions - Audio format channel conversions
- sample rate downsampling - sample rate downsampling
- non stereo or mono audio - non stereo or mono audio
- Audio spacialization on anything that's not stereo

build.c

@ -5,6 +5,16 @@
#define INITIAL_PROGRAM_MEMORY_SIZE MB(5) #define INITIAL_PROGRAM_MEMORY_SIZE MB(5)
// You might want to increase this if you get a log warning saying the temporary storage was overflown.
// In many cases, overflowing the temporary storage should be fine since it just wraps back around and
// allocations made way earlier in the frame are likely not used anymore.
// This might however not always be the case, so it's probably a good idea to make sure you always have
// enough temporary storage for your game.
#define TEMPORARY_STORAGE_SIZE MB(2)
// Enable VERY_DEBUG if you are having memory bugs to detect things like heap corruption earlier.
// #define VERY_DEBUG 1
typedef struct Context_Extra { typedef struct Context_Extra {
int monkee; int monkee;
} Context_Extra; } Context_Extra;
@ -33,6 +43,7 @@ typedef struct Context_Extra {
// #include "oogabooga/examples/renderer_stress_test.c" // #include "oogabooga/examples/renderer_stress_test.c"
// #include "oogabooga/examples/tile_game.c" // #include "oogabooga/examples/tile_game.c"
// #include "oogabooga/examples/audio_test.c" // #include "oogabooga/examples/audio_test.c"
// #include "oogabooga/examples/custom_shader.c"
// This is where you swap in your own project! // This is where you swap in your own project!
// #include "entry_yourepicgamename.c" // #include "entry_yourepicgamename.c"
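The comment on `TEMPORARY_STORAGE_SIZE` describes temporary storage that wraps back around when it overflows. A minimal sketch of that idea, assuming a plain bump allocator over a fixed buffer, is shown here. Every `sketch_` name and the 64-byte size are made up for illustration, alignment is ignored, and this is not the engine's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical size and names for illustration only.
#define SKETCH_TEMP_SIZE 64

static uint8_t sketch_temp_memory[SKETCH_TEMP_SIZE];
static size_t  sketch_temp_offset     = 0;
static int     sketch_temp_overflowed = 0; // set when an allocation had to wrap

// Meant to be called once per frame, in the spirit of reset_temporary_storage().
void sketch_temp_reset(void) {
    sketch_temp_offset     = 0;
    sketch_temp_overflowed = 0;
}

// Bump-allocate `size` bytes. If the request does not fit in the remaining
// space, wrap to the start of the buffer and hand out memory that earlier
// allocations from the same frame may still be using.
void *sketch_temp_alloc(size_t size) {
    if (size > SKETCH_TEMP_SIZE) return NULL; // can never fit
    if (sketch_temp_offset + size > SKETCH_TEMP_SIZE) {
        sketch_temp_offset     = 0;
        sketch_temp_overflowed = 1; // this is where a warning would be logged
    }
    void *result = &sketch_temp_memory[sketch_temp_offset];
    sketch_temp_offset += size;
    return result;
}

// Returns 1 if the second allocation wrapped and aliases the first one.
int sketch_temp_wrap_demo(void) {
    sketch_temp_reset();
    void *a = sketch_temp_alloc(48);
    void *b = sketch_temp_alloc(32); // 48 + 32 > 64, so this one wraps
    return a == b && sketch_temp_overflowed;
}
```

In the demo the second allocation lands on top of the first. That is the case the comment warns about: harmless if the first allocation is no longer in use, a bug if it is.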


@ -1,3 +1,43 @@
## v0.01.001 - Spatial audio, custom shading, scissor boxing
- Audio
- Implemented spatial audio playback
Simply set player->position (it's in ndc space (-1 to 1), see audio_test.c)
Save some computation with player->disable_spacialization = true
- Added position overloads for play_one_clip
play_one_audio_clip_source_at_position(source, pos)
play_one_audio_clip_at_position(path, pos)
- Implemented volume control with player->volume
- Renderer
- Implemented custom shading
- Recompile shader with your extension to the pixel stage with
shader_recompile_with_extension(source, cbuffer_size)
- pass a buffer to the shader constant buffer at b0 with
draw_frame.cbuffer = &my_cbuffer_data
- pass userdata in the form of Vector4's
Define VERTEX_2D_USER_DATA_COUNT as the number of Vector4 userdata entries to be part of each vertex.
You can set the userdata in Draw_Quad->userdata which is a Vector4[VERTEX_2D_USER_DATA_COUNT].
See custom_shading.c example.
- Added window scissor boxing (screen space)
push_window_scissor(min, max);
pop_window_scissor();
- Added draw_circle() and draw_circle_xform in drawing.c
- Made an example for custom shading (oogabooga/examples/custom_shading.c)
- Embed default shader into codebase & always compile
- Added draw_line(p0, p1, width, color)
- Implemented culling of quads out of view
- Fixed culling bug where big rectangles that overlapped the screen but had all corners outside the screen would get culled.
- Misc
- Improved text measure and added a better explanation for it in font.c.
- Added some useful Vector procedures:
vx_length()
vx_normalize()
vx_average()
vx_dot()
vx_abs()
vx_cross()
- added os_get_file_size_from_Path()
- Some simple restructuring of existing code
- Made heap corruption detection more robust
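The culling fix in the Renderer list above is easier to see with a small example. The renderer diff is not shown on this page, so the following is only a sketch under assumptions: rectangles are axis-aligned, the view is the ndc box from -1 to 1, and all `Sketch_`/`sketch_` names are hypothetical. It contrasts a corner-in-view test, which wrongly rejects a rectangle covering the whole screen, with a box overlap test.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float min_x, min_y, max_x, max_y; } Sketch_Rect;

// Naive test: treat the rectangle as visible only if at least one of its
// four corners lies inside the view.
bool sketch_any_corner_in_view(Sketch_Rect r, Sketch_Rect view) {
    float xs[2] = { r.min_x, r.max_x };
    float ys[2] = { r.min_y, r.max_y };
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            if (xs[i] >= view.min_x && xs[i] <= view.max_x &&
                ys[j] >= view.min_y && ys[j] <= view.max_y) {
                return true;
            }
        }
    }
    return false;
}

// Overlap test: visible if the two boxes intersect on both axes.
bool sketch_overlaps_view(Sketch_Rect r, Sketch_Rect view) {
    return r.min_x <= view.max_x && r.max_x >= view.min_x &&
           r.min_y <= view.max_y && r.max_y >= view.min_y;
}
```

For a rectangle spanning -2 to 2 on both axes, the corner test reports "not visible" even though it covers the entire view, while the overlap test keeps it. A rectangle entirely to one side of the view is rejected by both.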
## v0.01.000 - AUDIO! ## v0.01.000 - AUDIO!
- Added audio sources - Added audio sources


@ -1,5 +1,4 @@
/* /*
Loading audio: Loading audio:
@ -12,6 +11,8 @@
void play_one_audio_clip_source(Audio_Source source); void play_one_audio_clip_source(Audio_Source source);
void play_one_audio_clip(string path); void play_one_audio_clip(string path);
void play_one_audio_clip_source_at_position(Audio_Source source, Vector3 pos);
void play_one_audio_clip_at_position(string path, Vector3 pos);
Playing audio (with players): Playing audio (with players):
@ -387,8 +388,8 @@ wav_read_frames(Wav_Stream *wav, Audio_Format format, void *frames,
return 0; return 0;
} }
bool raw_is_float = wav->format == 0x0003; bool raw_is_float32 = wav->format == 0x0003;
bool raw_is_f32 = raw_is_float && wav->valid_bits_per_sample == 32; bool raw_is_f32 = raw_is_float32 && wav->valid_bits_per_sample == 32;
bool raw_is_int = wav->format == 0x0001; bool raw_is_int = wav->format == 0x0001;
bool raw_is_s16 = raw_is_int && wav->valid_bits_per_sample == 16; bool raw_is_s16 = raw_is_int && wav->valid_bits_per_sample == 16;
@ -978,8 +979,8 @@ resample_frames(void *dst, Audio_Format dst_format,
void *dst_comp = (u8*)dst_frame + c * dst_comp_size; void *dst_comp = (u8*)dst_frame + c * dst_comp_size;
if (src_format.bit_width == AUDIO_BITS_32) { if (src_format.bit_width == AUDIO_BITS_32) {
float sample_1 = *((f32*)src_comp_1); float32 sample_1 = *((f32*)src_comp_1);
float sample_2 = *((f32*)src_comp_2); float32 sample_2 = *((f32*)src_comp_2);
f32 s = sample_1 + lerp_factor * (sample_2 - sample_1); f32 s = sample_1 + lerp_factor * (sample_2 - sample_1);
memcpy(dst_comp, &s, sizeof(f32)); memcpy(dst_comp, &s, sizeof(f32));
} else if (src_format.bit_width == AUDIO_BITS_16) { } else if (src_format.bit_width == AUDIO_BITS_16) {
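The 32-bit branch in the hunk above blends two neighbouring source samples with `sample_1 + lerp_factor * (sample_2 - sample_1)`. A standalone sketch of that interpolation follows, together with a hypothetical single-channel resampling loop built on it. The mapping from output frame to source position is an assumption made for illustration and is not taken from `resample_frames`.

```c
#include <assert.h>

// Linearly interpolate between two neighbouring samples.
// lerp_factor is the fractional position between them, in [0, 1].
float sketch_lerp_sample(float sample_1, float sample_2, float lerp_factor) {
    return sample_1 + lerp_factor * (sample_2 - sample_1);
}

// Hypothetical helper: resample one channel of 32-bit float audio.
// Output frame i is assumed to map to source position i * src_count / dst_count.
void sketch_resample_f32(const float *src, unsigned src_count,
                         float *dst, unsigned dst_count) {
    if (src_count == 0) return; // nothing to read from
    for (unsigned i = 0; i < dst_count; ++i) {
        float pos   = (float)i * (float)src_count / (float)dst_count;
        unsigned i1 = (unsigned)pos;
        unsigned i2 = (i1 + 1 < src_count) ? i1 + 1 : i1; // clamp at the last sample
        dst[i] = sketch_lerp_sample(src[i1], src[i2], pos - (float)i1);
    }
}
```

Upsampling two samples `{0, 1}` to four frames with this loop yields `{0, 0.5, 1, 1}`: the last frames repeat the final sample because there is no later neighbour to blend with.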
@ -1154,10 +1155,15 @@ typedef struct Audio_Player {
u64 fade_frames; u64 fade_frames;
u64 fade_frames_total; u64 fade_frames_total;
bool release_when_done; bool release_when_done;
// I think we only need to sync when audio thread samples the source, which should be // I think we only need to sync when audio thread samples the source, which should be
// very quick and low contention, hence a spinlock. // very quick and low contention, hence a spinlock.
Spinlock sample_lock; Spinlock sample_lock;
// These can be set safely
Vector3 position; // ndc space -1 to 1
bool disable_spacialization;
float32 volume;
} Audio_Player; } Audio_Player;
#define AUDIO_PLAYERS_PER_BLOCK 128 #define AUDIO_PLAYERS_PER_BLOCK 128
typedef struct Audio_Player_Block { typedef struct Audio_Player_Block {
@ -1181,6 +1187,7 @@ audio_player_get_one() {
memset(&block->players[i], 0, sizeof(block->players[i])); memset(&block->players[i], 0, sizeof(block->players[i]));
block->players[i].allocated = true; block->players[i].allocated = true;
block->players[i].volume = 1.0;
return &block->players[i]; return &block->players[i];
} }
@ -1201,6 +1208,7 @@ audio_player_get_one() {
last->next = new_block; last->next = new_block;
new_block->players[0].allocated = true; new_block->players[0].allocated = true;
new_block->players[0].volume = 1.0;
return &new_block->players[0]; return &new_block->players[0];
} }
@ -1325,14 +1333,20 @@ Hash_Table just_audio_clips;
bool just_audio_clips_initted = false; bool just_audio_clips_initted = false;
void void
play_one_audio_clip_source(Audio_Source source) { play_one_audio_clip_source_at_position(Audio_Source source, Vector3 pos) {
Audio_Player *p = audio_player_get_one(); Audio_Player *p = audio_player_get_one();
audio_player_set_source(p, source, false); audio_player_set_source(p, source, false);
audio_player_set_state(p, AUDIO_PLAYER_STATE_PLAYING); audio_player_set_state(p, AUDIO_PLAYER_STATE_PLAYING);
p->position = pos;
p->release_when_done = true; p->release_when_done = true;
} }
void inline
play_one_audio_clip_source(Audio_Source source) {
play_one_audio_clip_source_at_position(source, v3(0, 0, 0));
}
void void
play_one_audio_clip(string path) { play_one_audio_clip_at_position(string path, Vector3 pos) {
if (!just_audio_clips_initted) { if (!just_audio_clips_initted) {
just_audio_clips_initted = true; just_audio_clips_initted = true;
just_audio_clips = make_hash_table(string, Audio_Source, get_heap_allocator()); just_audio_clips = make_hash_table(string, Audio_Source, get_heap_allocator());
@ -1340,7 +1354,7 @@ play_one_audio_clip(string path) {
Audio_Source *src_ptr = hash_table_find(&just_audio_clips, path); Audio_Source *src_ptr = hash_table_find(&just_audio_clips, path);
if (src_ptr) { if (src_ptr) {
play_one_audio_clip_source(*src_ptr); play_one_audio_clip_source_at_position(*src_ptr, pos);
} else { } else {
Audio_Source new_src; Audio_Source new_src;
bool ok = audio_open_source_load(&new_src, path, get_heap_allocator()); bool ok = audio_open_source_load(&new_src, path, get_heap_allocator());
@ -1349,8 +1363,13 @@ play_one_audio_clip(string path) {
return; return;
} }
hash_table_add(&just_audio_clips, path, new_src); hash_table_add(&just_audio_clips, path, new_src);
play_one_audio_clip_source(new_src); play_one_audio_clip_source_at_position(new_src, pos);
} }
}
void inline
play_one_audio_clip(string path) {
play_one_audio_clip_at_position(path, v3(0, 0, 0));
} }
void void
@ -1406,6 +1425,163 @@ audio_apply_fade_out(void *frames, u64 number_of_frames, Audio_Format format,
} }
} }
void apply_audio_spacialization_mono(void* frames, Audio_Format format, u64 number_of_frames, Vector3 pos) {
// No idea if this actually gives the perception of audio being positioned.
// I also don't have a mono audio device to test it.
float* audio_data = (float*)frames;
float32 distance = sqrtf(pos.x * pos.x + pos.y * pos.y + pos.z * pos.z);
float32 attenuation = 1.0f / (1.0f + distance);
float32 alpha = 0.1f;
float32 prev_sample = 0.0f;
u64 comp_size = get_audio_bit_width_byte_size(format.bit_width);
u64 frame_size = comp_size * format.channels;
for (u64 i = 0; i < number_of_frames; ++i) {
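float32 sample; // Filled in by convert_one_component below; indexing frames as float here could read out of bounds for narrower formats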
convert_one_component(
&sample,
AUDIO_BITS_32,
(u8*)frames+i*frame_size,
format.bit_width
);
sample *= attenuation;
sample = alpha * sample + (1.0f - alpha) * prev_sample;
prev_sample = sample;
convert_one_component(
(u8*)frames+i*frame_size,
format.bit_width,
&sample,
AUDIO_BITS_32
);
}
}
void apply_audio_spacialization(void* frames, Audio_Format format, u64 number_of_frames, Vector3 pos) {
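if (format.channels == 1) {
apply_audio_spacialization_mono(frames, format, number_of_frames, pos);
return; // Mono is fully handled above; without this the generic path would attenuate it a second time
}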
float32 distance = sqrtf(pos.x * pos.x + pos.y * pos.y + pos.z * pos.z);
float32 attenuation = 1.0f / (1.0f + distance);
float32 left_right_pan = (pos.x + 1.0f) * 0.5f;
float32 up_down_pan = (pos.y + 1.0f) * 0.5f;
float32 front_back_pan = (pos.z + 1.0f) * 0.5f;
u64 comp_size = get_audio_bit_width_byte_size(format.bit_width);
u64 frame_size = comp_size * format.channels;
float32 high_pass_coeff = 0.8f + 0.2f * up_down_pan;
float32 low_pass_coeff = 1.0f - high_pass_coeff;
// Apply gains to each frame
for (u64 i = 0; i < number_of_frames; ++i) {
for (u64 c = 0; c < format.channels; ++c) {
// Convert whatever to float32 -1 to 1
float32 sample;
convert_one_component(
&sample,
AUDIO_BITS_32,
(u8*)frames+i*frame_size+c*comp_size,
format.bit_width
);
float32 gain = 1.0f / format.channels;
if (format.channels == 2) {
// time delay and phase shift for vertical position
float32 phase_shift = (up_down_pan - 0.5f) * 0.5f; // 0.5 radians phase shift range
// Stereo
if (c == 0) {
gain = (1.0f - left_right_pan) * attenuation;
sample = sample * cos(phase_shift) - sample * sin(phase_shift);
} else if (c == 1) {
gain = left_right_pan * attenuation;
sample = sample * cos(phase_shift) + sample * sin(phase_shift);
}
} else if (format.channels == 4) {
// Quadraphonic sound (left-right, front-back)
if (c == 0) {
gain = (1.0f - left_right_pan) * (1.0f - front_back_pan) * attenuation;
} else if (c == 1) {
gain = left_right_pan * (1.0f - front_back_pan) * attenuation;
} else if (c == 2) {
gain = (1.0f - left_right_pan) * front_back_pan * attenuation;
} else if (c == 3) {
gain = left_right_pan * front_back_pan * attenuation;
}
} else if (format.channels == 6) {
// 5.1 surround sound (left, right, center, LFE, rear left, rear right)
if (c == 0) {
gain = (1.0f - left_right_pan) * attenuation;
} else if (c == 1) {
gain = left_right_pan * attenuation;
} else if (c == 2) {
gain = (1.0f - front_back_pan) * attenuation;
} else if (c == 3) {
gain = 0.5f * attenuation; // LFE (subwoofer) channel
} else if (c == 4) {
gain = (1.0f - left_right_pan) * front_back_pan * attenuation;
} else if (c == 5) {
gain = left_right_pan * front_back_pan * attenuation;
}
} else {
// No idea what device this is, just distribute equally
gain = attenuation / format.channels;
}
sample *= gain;
// Convert back to whatever
convert_one_component(
(u8*)frames+i*frame_size+c*comp_size,
format.bit_width,
&sample,
AUDIO_BITS_32
);
}
}
}
void apply_audio_volume(void* frames, Audio_Format format, u64 number_of_frames, float32 vol) {
// #Speed
// This is lazy, also it can be combined with other passes.
u64 comp_size = get_audio_bit_width_byte_size(format.bit_width);
u64 frame_size = comp_size * format.channels;
if (vol <= 0.0) {
memset(frames, 0, frame_size*number_of_frames);
return; // Buffer is already silent; skip the per-sample pass
}
for (u64 i = 0; i < number_of_frames; ++i) {
for (u64 c = 0; c < format.channels; ++c) {
float32 sample;
convert_one_component(
&sample,
AUDIO_BITS_32,
(u8*)frames+i*frame_size+c*comp_size,
format.bit_width
);
sample *= vol;
convert_one_component(
(u8*)frames+i*frame_size+c*comp_size,
format.bit_width,
&sample,
AUDIO_BITS_32
);
}
}
}
// This is supposed to be called by OS layer audio thread whenever it wants more audio samples // This is supposed to be called by OS layer audio thread whenever it wants more audio samples
void void
do_program_audio_sample(u64 number_of_output_frames, Audio_Format out_format, do_program_audio_sample(u64 number_of_output_frames, Audio_Format out_format,
@ -1581,6 +1757,13 @@ do_program_audio_sample(u64 number_of_output_frames, Audio_Format out_format,
assert(converted == number_of_output_frames); assert(converted == number_of_output_frames);
} }
if (!p->disable_spacialization) {
apply_audio_spacialization(mix_buffer, out_format, number_of_output_frames, p->position);
}
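if (p->volume != 1.0) { // Skip only when volume is exactly 1.0 (a no-op); a volume of 0.0 must still be applied to silence the player
apply_audio_volume(mix_buffer, out_format, number_of_output_frames, p->volume);
}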
mix_frames(output, mix_buffer, number_of_output_frames, out_format); mix_frames(output, mix_buffer, number_of_output_frames, out_format);
mutex_release(&src.mutex_for_destroy); mutex_release(&src.mutex_for_destroy);
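For the stereo case, `apply_audio_spacialization` derives one gain per channel from the player position. The sketch below isolates just that arithmetic, using the same formulas as the diff (`attenuation = 1 / (1 + distance)` and `left_right_pan = (pos.x + 1) * 0.5`). The struct and function names are hypothetical, the distance is assumed to be computed by the caller, and the phase-shift step is left out.

```c
#include <assert.h>

typedef struct { float left, right; } Sketch_Stereo_Gains;

// pos_x is the horizontal position in ndc space (-1 = left, 1 = right).
// distance is the length of the position vector, computed by the caller.
Sketch_Stereo_Gains sketch_stereo_gains(float pos_x, float distance) {
    float attenuation    = 1.0f / (1.0f + distance); // farther away = quieter
    float left_right_pan = (pos_x + 1.0f) * 0.5f;    // 0 = fully left, 1 = fully right
    Sketch_Stereo_Gains g;
    g.left  = (1.0f - left_right_pan) * attenuation;
    g.right = left_right_pan * attenuation;
    return g;
}
```

One consequence of these formulas is worth noting: a player at the origin gets a gain of 0.5 on both channels, so it plays quieter than the same player with `disable_spacialization` set. A player at x = 1 with distance 1 gets 0 on the left and 0.5 on the right.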


@ -18,9 +18,11 @@ void dump_stack_trace();
#define DEFER(start, end) for(int _i_ = ((start), 0); _i_ == 0; _i_ += 1, (end)) #define DEFER(start, end) for(int _i_ = ((start), 0); _i_ == 0; _i_ += 1, (end))
#define RAW_STRING(...) (#__VA_ARGS__)
#if CONFIGURATION == RELEASE #if CONFIGURATION == RELEASE
#undef assert #undef assert
#define assert(...) (void)0; #define assert(x, ...) (void)(x)
#endif #endif
#define panic(...) { print(__VA_ARGS__); crash(); } #define panic(...) { print(__VA_ARGS__); crash(); }
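The `DEFER(start, end)` macro in this hunk is a one-iteration `for` loop: `start` runs in the initializer, the body runs once, and `end` runs in the increment expression. A small sketch of the resulting order follows. The macro is copied from the diff; the `sketch_` helpers are hypothetical and exist only to record the order of calls.

```c
#include <assert.h>

#define DEFER(start, end) for(int _i_ = ((start), 0); _i_ == 0; _i_ += 1, (end))

static int sketch_trace[3];
static int sketch_trace_count = 0;

static void sketch_record(int value) {
    sketch_trace[sketch_trace_count++] = value;
}

// Records 1 (start), 2 (body), 3 (end) and returns them as the number 123.
int sketch_defer_order_demo(void) {
    sketch_trace_count = 0;
    DEFER(sketch_record(1), sketch_record(3)) {
        sketch_record(2);
    }
    return sketch_trace[0] * 100 + sketch_trace[1] * 10 + sketch_trace[2];
}
```

Because `end` lives in the loop's increment expression, leaving the body early with `break` or `return` skips it, so this pattern only pairs `start` and `end` reliably when the body runs to completion.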


@ -34,7 +34,7 @@ typedef struct Cpu_Capabilities {
__debugbreak(); __debugbreak();
volatile int *a = 0; volatile int *a = 0;
*a = 5; *a = 5;
a = (int*)0xDEADBEEF; a = (volatile int*)0xDEADBEEF;
*a = 5; *a = 5;
} }
#include <intrin.h> #include <intrin.h>

File diff suppressed because it is too large


@ -1,203 +0,0 @@
struct VS_INPUT
{
float4 position : POSITION;
float2 uv : TEXCOORD;
float4 color : COLOR;
int data1: DATA1_;
// s8 texture_index
// u8 type
// u8 sampler_index
// u8
};
struct PS_INPUT
{
float4 position : SV_POSITION;
float2 uv : TEXCOORD0;
float4 color : COLOR;
int texture_index: TEXTURE_INDEX;
int type: TYPE;
int sampler_index: SAMPLER_INDEX;
};
PS_INPUT vs_main(VS_INPUT input)
{
PS_INPUT output;
output.position = input.position;
output.uv = input.uv;
output.color = input.color;
output.texture_index = (input.data1) & 0xFF;
output.type = (input.data1 >> 8) & 0xFF;
output.sampler_index = (input.data1 >> 16) & 0xFF;
return output;
}
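`vs_main` above unpacks three byte-sized fields from the single `data1` integer using shifts of 0, 8 and 16 bits. A sketch of the matching CPU-side packing in C is shown below. The function names are hypothetical, and the renderer's real packing code is not part of this diff.

```c
#include <assert.h>
#include <stdint.h>

// Pack three byte-sized fields into one 32-bit value, lowest byte first,
// matching the shifts used in vs_main.
uint32_t sketch_pack_data1(uint8_t texture_index, uint8_t type, uint8_t sampler_index) {
    return (uint32_t)texture_index
         | ((uint32_t)type << 8)
         | ((uint32_t)sampler_index << 16);
}

// Mirror of the shader-side unpacking.
uint8_t sketch_unpack_texture_index(uint32_t data1) { return (uint8_t)(data1 & 0xFF); }
uint8_t sketch_unpack_type(uint32_t data1)          { return (uint8_t)((data1 >> 8) & 0xFF); }
uint8_t sketch_unpack_sampler_index(uint32_t data1) { return (uint8_t)((data1 >> 16) & 0xFF); }
```

The struct comment lists `texture_index` as `s8`, but masking with `0xFF` yields 0 to 255, so a value of -1 would arrive in the pixel stage as 255 unless it is sign-extended there.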
// #Magicvalue
Texture2D textures[32] : register(t0);
SamplerState image_sampler_0 : register(s0);
SamplerState image_sampler_1 : register(s1);
SamplerState image_sampler_2 : register(s2);
SamplerState image_sampler_3 : register(s3);
float4 sample_texture(int texture_index, int sampler_index, float2 uv) {
// I love hlsl
if (sampler_index == 0) {
if (texture_index == 0) return textures[0].Sample(image_sampler_0, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_0, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_0, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_0, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_0, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_0, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_0, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_0, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_0, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_0, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_0, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_0, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_0, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_0, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_0, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_0, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_0, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_0, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_0, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_0, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_0, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_0, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_0, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_0, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_0, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_0, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_0, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_0, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_0, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_0, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_0, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_0, uv);
} else if (sampler_index == 1) {
if (texture_index == 0) return textures[0].Sample(image_sampler_1, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_1, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_1, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_1, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_1, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_1, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_1, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_1, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_1, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_1, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_1, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_1, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_1, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_1, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_1, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_1, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_1, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_1, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_1, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_1, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_1, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_1, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_1, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_1, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_1, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_1, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_1, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_1, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_1, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_1, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_1, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_1, uv);
} else if (sampler_index == 2) {
if (texture_index == 0) return textures[0].Sample(image_sampler_2, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_2, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_2, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_2, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_2, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_2, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_2, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_2, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_2, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_2, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_2, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_2, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_2, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_2, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_2, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_2, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_2, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_2, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_2, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_2, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_2, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_2, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_2, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_2, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_2, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_2, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_2, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_2, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_2, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_2, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_2, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_2, uv);
} else if (sampler_index == 3) {
if (texture_index == 0) return textures[0].Sample(image_sampler_3, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_3, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_3, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_3, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_3, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_3, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_3, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_3, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_3, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_3, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_3, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_3, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_3, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_3, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_3, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_3, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_3, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_3, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_3, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_3, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_3, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_3, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_3, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_3, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_3, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_3, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_3, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_3, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_3, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_3, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_3, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_3, uv);
}
return float4(1.0, 0.0, 0.0, 1.0);
}
#define QUAD_TYPE_REGULAR 0
#define QUAD_TYPE_TEXT 1
float4 ps_main(PS_INPUT input) : SV_TARGET
{
if (input.type == QUAD_TYPE_REGULAR) {
if (input.texture_index >= 0 && input.texture_index < 32 && input.sampler_index >= 0 && input.sampler_index <= 3) {
return sample_texture(input.texture_index, input.sampler_index, input.uv)*input.color;
} else {
return input.color;
}
} else if (input.type == QUAD_TYPE_TEXT) {
if (input.texture_index >= 0 && input.texture_index < 32 && input.sampler_index >= 0 && input.sampler_index <= 3) {
float alpha = sample_texture(input.texture_index, input.sampler_index, input.uv).x;
return float4(1.0, 1.0, 1.0, alpha)*input.color;
} else {
return input.color;
}
}
return float4(0.0, 1.0, 0.0, 1.0);
}


@ -1,92 +1,48 @@
/* /*
Usage: void push_z_layer(s32 z);
void pop_z_layer();
Just call draw_xxx procedures anywhere in the frame when you want something to be drawn that frame. void push_window_scissor(Vector2 min, Vector2 max);
void pop_window_scissor();
// Examples:
// Verbose
Draw_Quad quad;
quad.bottom_left = v2(x, y);
quad.top_left = v2(x, y);
quad.top_right = v2(x, y);
quad.bottom_right = v2(x, y);
quad.color = v4(r, g, b, a);
quad.image = my_image; // ZERO(Gfx_Image) To draw a plain color
quad.uv = v4(0, 0, 1, 1);
draw_quad(quad);
// Basic rect. Bottom left at X=-0.25, Y=-0.5 with a size of W=0.5, H=0.5
draw_rect(v2(-0.25, -0.5), v2(0.5, 0.5), COLOR_GREEN);
// Rotated rect. Bottom left at X=-0.25, Y=-0.5 with a size of W=0.5, H=0.5
// With a centered pivot (half size) and a rotation of 2.4 radians
// If pivot is v2(0, 0), the rectangle will rotate around its bottom left.
draw_rect_rotated(v2(-0.25, -0.5), v2(0.5, 0.5), COLOR_GREEN, v2(0.25, 0.25), 2.4f);
// Basic image. Bottom left at X=-0.25, Y=-0.5 with a size of W=0.5, H=0.5
draw_image(v2(-0.25, -0.5), v2(0.5, 0.5), COLOR_GREEN);
// Rotated image. Bottom left at X=-0.25, Y=-0.5 with a size of W=0.5, H=0.5
// With a centered pivot (half size) and a rotation of 2.4 radians
// If pivot is v2(0, 0), the rectangle will rotate around its bottom left.
draw_image_rotated(v2(-0.25, -0.5), v2(0.5, 0.5), COLOR_GREEN, v2(0.25, 0.25), 2.4f);
// Loading an image (png only)
Gfx_Image image = load_image_from_disk("my_image.png");
if (!image.data) {
// We failed loading the image.
}
// If you ever need to free the image:
delete_image(image);
API:
// !! IMPORTANT
// The Draw_Quad* returned from draw procedures is a temporary pointer and may be
// invalid after the next draw_xxxx call. This is because quads are stored in a
// resizing buffer (because that gave us a non-trivial performance boost).
// So the purpose of returning them is to customize the quad right after the draw proc.
Draw_Quad *draw_rect(Vector2 position, Vector2 size, Vector4 color);
Draw_Quad *draw_rect_xform(Matrix4 xform, Vector2 size, Vector4 color);
Draw_Quad *draw_circle(Vector2 position, Vector2 size, Vector4 color);
Draw_Quad *draw_circle_xform(Matrix4 xform, Vector2 size, Vector4 color);
Draw_Quad *draw_image(Gfx_Image *image, Vector2 position, Vector2 size, Vector4 color);
Draw_Quad *draw_image_xform(Gfx_Image *image, Matrix4 xform, Vector2 size, Vector4 color);
Draw_Quad *draw_quad_projected(Draw_Quad quad, Matrix4 world_to_clip); Draw_Quad *draw_quad_projected(Draw_Quad quad, Matrix4 world_to_clip);
Draw_Quad *draw_quad(Draw_Quad quad); Draw_Quad *draw_quad(Draw_Quad quad);
Draw_Quad *draw_quad_xform(Draw_Quad quad, Matrix4 xform); Draw_Quad *draw_quad_xform(Draw_Quad quad, Matrix4 xform);
Draw_Quad *draw_rect(Vector2 position, Vector2 size, Vector4 color); bool draw_text_callback(Gfx_Glyph glyph, Gfx_Font_Atlas *atlas, float glyph_x, float glyph_y, void *ud);
Draw_Quad *draw_rect_xform(Matrix4 xform, Vector2 size, Vector4 color); void draw_text_xform(Gfx_Font *font, string text, u32 raster_height, Matrix4 xform, Vector2 scale, Vector4 color);
Draw_Quad *draw_image(Gfx_Image *image, Vector2 position, Vector2 size, Vector4 color);
Draw_Quad *draw_image_xform(Gfx_Image *image, Matrix4 xform, Vector2 size, Vector4 color);
// raster_height is the pixel height that the text will be rasterized at. If text is blurry,
// you can try to increase raster_height and lower scale.
void draw_text(Gfx_Font *font, string text, u32 raster_height, Vector2 position, Vector2 scale, Vector4 color); void draw_text(Gfx_Font *font, string text, u32 raster_height, Vector2 position, Vector2 scale, Vector4 color);
void draw_text_xform(Gfx_Font *font, string text, u32 raster_height, Matrix4 xform, Vector4 color); Gfx_Text_Metrics draw_text_and_measure(Gfx_Font *font, string text, u32 raster_height, Vector2 position, Vector2 scale, Vector4 color);
void draw_line(Vector2 p0, Vector2 p1, float line_width, Vector4 color);
*/ */
// We use radix sort so the exact bit count is of importance // We use radix sort so the exact bit count is of importance
#define MAX_Z_BITS 21 #define MAX_Z_BITS 21
#define MAX_Z ((1 << MAX_Z_BITS)/2) #define MAX_Z ((1 << MAX_Z_BITS)/2)
#define Z_STACK_MAX 4096 #define Z_STACK_MAX 4096
#define SCISSOR_STACK_MAX 4096
typedef struct Draw_Quad { typedef struct Draw_Quad {
// BEWARE !! These are in ndc
Vector2 bottom_left, top_left, top_right, bottom_right; Vector2 bottom_left, top_left, top_right, bottom_right;
// r, g, b, a // r, g, b, a
Vector4 color; Vector4 color;
Gfx_Image *image; Gfx_Image *image;
// x1, y1, x2, y2
Vector4 uv;
u8 type;
Gfx_Filter_Mode image_min_filter; Gfx_Filter_Mode image_min_filter;
Gfx_Filter_Mode image_mag_filter; Gfx_Filter_Mode image_mag_filter;
s32 z; s32 z;
u8 type;
bool has_scissor;
// x1, y1, x2, y2
Vector4 uv;
Vector4 scissor;
Vector4 userdata[VERTEX_2D_USER_DATA_COUNT]; // #Volatile do NOT change this to a pointer
} Draw_Quad; } Draw_Quad;
@ -102,6 +58,12 @@ typedef struct Draw_Frame {
bool enable_z_sorting; bool enable_z_sorting;
s32 z_stack[Z_STACK_MAX]; s32 z_stack[Z_STACK_MAX];
u64 z_count; u64 z_count;
Vector4 scissor_stack[SCISSOR_STACK_MAX];
u64 scissor_count;
void *cbuffer;
} Draw_Frame; } Draw_Frame;
// This frame is passed to the platform layer and rendered in os_update. // This frame is passed to the platform layer and rendered in os_update.
// Resets every frame. // Resets every frame.
@ -127,18 +89,49 @@ void pop_z_layer() {
draw_frame.z_count -= 1; draw_frame.z_count -= 1;
} }
void push_window_scissor(Vector2 min, Vector2 max) {
assert(draw_frame.scissor_count < SCISSOR_STACK_MAX, "Too many scissors pushed. You can pop with pop_window_scissor() when you are done drawing to it.");
draw_frame.scissor_stack[draw_frame.scissor_count] = v4(min.x, min.y, max.x, max.y);
draw_frame.scissor_count += 1;
}
void pop_window_scissor() {
assert(draw_frame.scissor_count > 0, "No scissors to pop!");
draw_frame.scissor_count -= 1;
}
Draw_Quad _nil_quad = {0};
Draw_Quad *draw_quad_projected(Draw_Quad quad, Matrix4 world_to_clip) { Draw_Quad *draw_quad_projected(Draw_Quad quad, Matrix4 world_to_clip) {
quad.bottom_left = m4_transform(world_to_clip, v4(v2_expand(quad.bottom_left), 0, 1)).xy; quad.bottom_left = m4_transform(world_to_clip, v4(v2_expand(quad.bottom_left), 0, 1)).xy;
quad.top_left = m4_transform(world_to_clip, v4(v2_expand(quad.top_left), 0, 1)).xy; quad.top_left = m4_transform(world_to_clip, v4(v2_expand(quad.top_left), 0, 1)).xy;
quad.top_right = m4_transform(world_to_clip, v4(v2_expand(quad.top_right), 0, 1)).xy; quad.top_right = m4_transform(world_to_clip, v4(v2_expand(quad.top_right), 0, 1)).xy;
quad.bottom_right = m4_transform(world_to_clip, v4(v2_expand(quad.bottom_right), 0, 1)).xy; quad.bottom_right = m4_transform(world_to_clip, v4(v2_expand(quad.bottom_right), 0, 1)).xy;
bool should_cull =
(quad.bottom_left.x < -1 && quad.top_left.x < -1 && quad.top_right.x < -1 && quad.bottom_right.x < -1) ||
(quad.bottom_left.x > 1 && quad.top_left.x > 1 && quad.top_right.x > 1 && quad.bottom_right.x > 1) ||
(quad.bottom_left.y < -1 && quad.top_left.y < -1 && quad.top_right.y < -1 && quad.bottom_right.y < -1) ||
(quad.bottom_left.y > 1 && quad.top_left.y > 1 && quad.top_right.y > 1 && quad.bottom_right.y > 1);
if (should_cull) {
return &_nil_quad;
}
quad.image_min_filter = GFX_FILTER_MODE_NEAREST; quad.image_min_filter = GFX_FILTER_MODE_NEAREST;
quad.image_mag_filter = GFX_FILTER_MODE_NEAREST; quad.image_mag_filter = GFX_FILTER_MODE_NEAREST;
quad.z = 0; quad.z = 0;
if (draw_frame.z_count > 0) quad.z = draw_frame.z_stack[draw_frame.z_count-1]; if (draw_frame.z_count > 0) quad.z = draw_frame.z_stack[draw_frame.z_count-1];
quad.has_scissor = false;
if (draw_frame.scissor_count > 0) {
quad.scissor = draw_frame.scissor_stack[draw_frame.scissor_count-1];
quad.has_scissor = true;
}
memset(quad.userdata, 0, sizeof(quad.userdata));
if (draw_frame.num_quads >= allocated_quads) { if (draw_frame.num_quads >= allocated_quads) {
// #Memory // #Memory
@ -173,6 +166,7 @@ Draw_Quad *draw_quad_xform(Draw_Quad quad, Matrix4 xform) {
} }
Draw_Quad *draw_rect(Vector2 position, Vector2 size, Vector4 color) { Draw_Quad *draw_rect(Vector2 position, Vector2 size, Vector4 color) {
// #Copypaste #Volatile
const float32 left = position.x; const float32 left = position.x;
const float32 right = position.x + size.x; const float32 right = position.x + size.x;
const float32 bottom = position.y; const float32 bottom = position.y;
@ -190,6 +184,7 @@ Draw_Quad *draw_rect(Vector2 position, Vector2 size, Vector4 color) {
return draw_quad(q); return draw_quad(q);
} }
Draw_Quad *draw_rect_xform(Matrix4 xform, Vector2 size, Vector4 color) { Draw_Quad *draw_rect_xform(Matrix4 xform, Vector2 size, Vector4 color) {
// #Copypaste #Volatile
Draw_Quad q = ZERO(Draw_Quad); Draw_Quad q = ZERO(Draw_Quad);
q.bottom_left = v2(0, 0); q.bottom_left = v2(0, 0);
q.top_left = v2(0, size.y); q.top_left = v2(0, size.y);
@ -201,6 +196,37 @@ Draw_Quad *draw_rect_xform(Matrix4 xform, Vector2 size, Vector4 color) {
return draw_quad_xform(q, xform); return draw_quad_xform(q, xform);
} }
Draw_Quad *draw_circle(Vector2 position, Vector2 size, Vector4 color) {
// #Copypaste #Volatile
const float32 left = position.x;
const float32 right = position.x + size.x;
const float32 bottom = position.y;
const float32 top = position.y+size.y;
Draw_Quad q;
q.bottom_left = v2(left, bottom);
q.top_left = v2(left, top);
q.top_right = v2(right, top);
q.bottom_right = v2(right, bottom);
q.color = color;
q.image = 0;
q.type = QUAD_TYPE_CIRCLE;
return draw_quad(q);
}
Draw_Quad *draw_circle_xform(Matrix4 xform, Vector2 size, Vector4 color) {
// #Copypaste #Volatile
Draw_Quad q = ZERO(Draw_Quad);
q.bottom_left = v2(0, 0);
q.top_left = v2(0, size.y);
q.top_right = v2(size.x, size.y);
q.bottom_right = v2(size.x, 0);
q.color = color;
q.image = 0;
q.type = QUAD_TYPE_CIRCLE;
return draw_quad_xform(q, xform);
}
Draw_Quad *draw_image(Gfx_Image *image, Vector2 position, Vector2 size, Vector4 color) { Draw_Quad *draw_image(Gfx_Image *image, Vector2 position, Vector2 size, Vector4 color) {
Draw_Quad *q = draw_rect(position, size, color); Draw_Quad *q = draw_rect(position, size, color);
@ -272,6 +298,16 @@ Gfx_Text_Metrics draw_text_and_measure(Gfx_Font *font, string text, u32 raster_h
return measure_text(font, text, raster_height, scale); return measure_text(font, text, raster_height, scale);
} }
void draw_line(Vector2 p0, Vector2 p1, float line_width, Vector4 color) {
Vector2 dir = v2(p1.x - p0.x, p1.y - p0.y);
float length = sqrt(dir.x * dir.x + dir.y * dir.y);
float r = atan2(-dir.y, dir.x);
Matrix4 line_xform = m4_scalar(1);
line_xform = m4_translate(line_xform, v3(p0.x, p0.y, 0));
line_xform = m4_rotate_z(line_xform, r);
line_xform = m4_translate(line_xform, v3(0, -line_width/2, 0));
draw_rect_xform(line_xform, v2(length, line_width), color);
}
#define COLOR_RED ((Vector4){1.0, 0.0, 0.0, 1.0}) #define COLOR_RED ((Vector4){1.0, 0.0, 0.0, 1.0})
#define COLOR_GREEN ((Vector4){0.0, 1.0, 0.0, 1.0}) #define COLOR_GREEN ((Vector4){0.0, 1.0, 0.0, 1.0})
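
The early-out added to `draw_quad_projected` can be read in isolation as a conservative clip-space test: a quad is dropped only when all four corners lie beyond the same edge of the [-1, 1] square. Below is a minimal standalone sketch of that idea. `Vec2` and `quad_is_outside_clip` are hypothetical names for illustration, not engine API.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical stand-in for the engine's Vector2; only x and y are needed here.
typedef struct { float x, y; } Vec2;

// Returns true when all four corners lie beyond the same edge of the
// [-1, 1] clip-space square, so the quad cannot overlap the visible area.
bool quad_is_outside_clip(Vec2 bl, Vec2 tl, Vec2 tr, Vec2 br) {
    return (bl.x < -1 && tl.x < -1 && tr.x < -1 && br.x < -1)   // entirely left
        || (bl.x >  1 && tl.x >  1 && tr.x >  1 && br.x >  1)   // entirely right
        || (bl.y < -1 && tl.y < -1 && tr.y < -1 && br.y < -1)   // entirely below
        || (bl.y >  1 && tl.y >  1 && tr.y >  1 && br.y >  1);  // entirely above
}
```

The test is conservative: a rotated quad that passes diagonally outside a corner of the square is not rejected, it is just submitted and clipped later. That trade keeps the check to a handful of comparisons per quad.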


@ -57,8 +57,13 @@ int entry(int argc, char **argv) {
draw_frame.projection = m4_make_orthographic_projection(window.pixel_width * -0.5, window.pixel_width * 0.5, window.pixel_height * -0.5, window.pixel_height * 0.5, -1, 10); draw_frame.projection = m4_make_orthographic_projection(window.pixel_width * -0.5, window.pixel_width * 0.5, window.pixel_height * -0.5, window.pixel_height * 0.5, -1, 10);
if (is_key_just_pressed(MOUSE_BUTTON_RIGHT)) { if (is_key_just_pressed(MOUSE_BUTTON_RIGHT)) {
float mx = input_frame.mouse_x;
float my = input_frame.mouse_y;
// Easy mode (when you don't care and just want to play a clip) // Easy mode (when you don't care and just want to play a clip)
play_one_audio_clip(STR("oogabooga/examples/block.wav")); Vector3 p = v3(mx/(f32)window.width*2.0-1, my/(f32)window.height*2.0-1, 0);
log("%f, %f", p.x, p.y);
play_one_audio_clip_at_position(STR("oogabooga/examples/block.wav"), p);
// Or just play_one_audio_clip if you don't care about spatialization
} }
@ -66,7 +71,7 @@ int entry(int argc, char **argv) {
Vector4 rect; Vector4 rect;
rect.x = -window.width/2+40; rect.x = -window.width/2+40;
rect.y = window.height/2-FONT_HEIGHT-40; rect.y = window.height/2-FONT_HEIGHT-40;
rect.z = FONT_HEIGHT*5; rect.z = FONT_HEIGHT*8;
rect.w = FONT_HEIGHT*1.5; rect.w = FONT_HEIGHT*1.5;
bool clip_playing = clip_player->state == AUDIO_PLAYER_STATE_PLAYING; bool clip_playing = clip_player->state == AUDIO_PLAYER_STATE_PLAYING;
@ -91,7 +96,23 @@ int entry(int argc, char **argv) {
audio_player_set_progression_factor(song_player, 0); audio_player_set_progression_factor(song_player, 0);
} }
rect.y = window.height/2-FONT_HEIGHT-40;
rect.x += rect.z + FONT_HEIGHT;
if (button(STR("Song vol up"), rect.xy, rect.zw, false)) {
song_player->volume += 0.05;
}
rect.y -= FONT_HEIGHT*1.8;
if (button(STR("Song vol down"), rect.xy, rect.zw, false)) {
song_player->volume -= 0.05;
}
song_player->volume = clamp(song_player->volume, 0, 20);
rect.x += rect.z + FONT_HEIGHT;
draw_text(font, tprint("Song volume: %d%%", (s64)round(song_player->volume*100)), FONT_HEIGHT, v2_sub(rect.xy, v2(2, -2)), v2(1, 1), COLOR_BLACK);
draw_text(font, tprint("Song volume: %d%%", (s64)round(song_player->volume*100)), FONT_HEIGHT, rect.xy, v2(1, 1), COLOR_WHITE);
rect.y -= FONT_HEIGHT*3; rect.y -= FONT_HEIGHT*3;
draw_text(font, STR("Right-click for thing"), FONT_HEIGHT, v2_sub(rect.xy, v2(2, -2)), v2(1, 1), COLOR_BLACK); draw_text(font, STR("Right-click for thing"), FONT_HEIGHT, v2_sub(rect.xy, v2(2, -2)), v2(1, 1), COLOR_BLACK);
draw_text(font, STR("Right-click for thing"), FONT_HEIGHT, rect.xy, v2(1, 1), COLOR_WHITE); draw_text(font, STR("Right-click for thing"), FONT_HEIGHT, rect.xy, v2(1, 1), COLOR_WHITE);


@ -0,0 +1,116 @@
// BEWARE of constant buffer packing rules (HLSL packs into 16-byte registers; similar to, but not the same as, std140):
// https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-packing-rules
typedef struct My_Cbuffer {
Vector2 mouse_pos_screen; // We use this to make a light around the mouse cursor
Vector2 window_size; // We only use this to revert the Y in the shader because for some reason d3d11 inverts it.
} My_Cbuffer;
// These are the detail types which we implement in the shader
#define DETAIL_TYPE_ROUNDED_CORNERS 1
#define DETAIL_TYPE_OUTLINED 2
// With custom shading we can extend the rendering library!
Draw_Quad *draw_rounded_rect(Vector2 p, Vector2 size, Vector4 color, float radius);
Draw_Quad *draw_rounded_rect_xform(Matrix4 xform, Vector2 size, Vector4 color, float radius);
Draw_Quad *draw_outlined_rect(Vector2 p, Vector2 size, Vector4 color, float line_width);
Draw_Quad *draw_outlined_rect_xform(Matrix4 xform, Vector2 size, Vector4 color, float line_width);
int entry(int argc, char **argv) {
window.title = STR("Custom Shader Example");
window.scaled_width = 1280;
window.scaled_height = 720;
window.x = 200;
window.y = 90;
window.clear_color = hex_to_rgba(0x6495EDff);
string source;
bool ok = os_read_entire_file("oogabooga/examples/custom_shader.hlsl", &source, get_heap_allocator());
assert(ok, "Could not read oogabooga/examples/custom_shader.hlsl");
// This is slow because it recompiles the shader. However, it should only need to happen once (or once per hot reload).
// If it fails, it returns false and keeps using whatever shader was active before.
shader_recompile_with_extension(source, sizeof(My_Cbuffer));
dealloc_string(get_heap_allocator(), source);
// This memory needs to stay alive throughout the frame because we pass the pointer to it in draw_frame.cbuffer.
// If this memory is invalidated before gfx_update after setting draw_frame.cbuffer, then gfx_update will copy
// memory from an invalid address.
My_Cbuffer cbuffer;
float64 last_time = os_get_current_time_in_seconds();
while (!window.should_close) {
float64 now = os_get_current_time_in_seconds();
if ((int)now != (int)last_time) {
log("%.2f FPS\n%.2fms", 1.0/(now-last_time), (now-last_time)*1000);
}
last_time = now;
reset_temporary_storage();
cbuffer.mouse_pos_screen = v2(input_frame.mouse_x, input_frame.mouse_y);
cbuffer.window_size = v2(window.width, window.height);
draw_frame.cbuffer = &cbuffer;
// Just draw a big rect to cover background, so our lighting shader will apply to background
draw_rect(v2(-5, -5), v2(10, 10), v4(.4, .4, .4, 1.0));
Matrix4 rect_xform = m4_scalar(1.0);
rect_xform = m4_rotate_z(rect_xform, (f32)now);
rect_xform = m4_translate(rect_xform, v3(-.25f, -.25f, 0));
Draw_Quad *q = draw_rounded_rect_xform(rect_xform, v2(.5f, .5f), COLOR_GREEN, 0.1);
draw_outlined_rect(v2(sin(now), -.8), v2(.5, .25), COLOR_RED, 15);
// Shader hot reloading
if (is_key_just_pressed('R')) {
ok = os_read_entire_file("oogabooga/examples/custom_shader.hlsl", &source, get_heap_allocator());
assert(ok, "Could not read oogabooga/examples/custom_shader.hlsl");
shader_recompile_with_extension(source, sizeof(My_Cbuffer));
dealloc_string(get_heap_allocator(), source);
}
os_update();
gfx_update();
}
return 0;
}
Draw_Quad *draw_rounded_rect(Vector2 p, Vector2 size, Vector4 color, float radius) {
Draw_Quad *q = draw_rect(p, size, color);
// detail_type
q->userdata[0].x = DETAIL_TYPE_ROUNDED_CORNERS;
// corner_radius
q->userdata[0].y = radius;
return q;
}
Draw_Quad *draw_rounded_rect_xform(Matrix4 xform, Vector2 size, Vector4 color, float radius) {
Draw_Quad *q = draw_rect_xform(xform, size, color);
// detail_type
q->userdata[0].x = DETAIL_TYPE_ROUNDED_CORNERS;
// corner_radius
q->userdata[0].y = radius;
return q;
}
Draw_Quad *draw_outlined_rect(Vector2 p, Vector2 size, Vector4 color, float line_width) {
Draw_Quad *q = draw_rect(p, size, color);
// detail_type
q->userdata[0].x = DETAIL_TYPE_OUTLINED;
// line_width
q->userdata[0].y = line_width;
return q;
}
Draw_Quad *draw_outlined_rect_xform(Matrix4 xform, Vector2 size, Vector4 color, float line_width) {
Draw_Quad *q = draw_rect_xform(xform, size, color);
// detail_type
q->userdata[0].x = DETAIL_TYPE_OUTLINED;
// line_width
q->userdata[0].y = line_width;
return q;
}
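
The packing warning at the top of this example can be backed by compile-time checks on the C side. The sketch below is only an illustration under stated assumptions: `Vec2`, `Example_Cbuffer` and `fits_in_one_register` are hypothetical names, and `Vec2` is assumed to be two 32-bit floats like the engine's `Vector2`. It relies on two rules: D3D11 constant buffer sizes must be multiples of 16 bytes, and HLSL does not let a variable straddle a 16-byte register boundary.

```c
#include <assert.h>  // assert, static_assert (C11)
#include <stdbool.h>
#include <stddef.h>  // offsetof, size_t

// Hypothetical stand-in for the engine's Vector2 (two 32-bit floats).
typedef struct { float x, y; } Vec2;

// Same member order as My_Cbuffer in the example.
typedef struct {
    Vec2 mouse_pos_screen; // bytes 0..7
    Vec2 window_size;      // bytes 8..15, shares the first 16-byte register
} Example_Cbuffer;

// D3D11 requires constant buffer sizes to be multiples of 16 bytes.
static_assert(sizeof(Example_Cbuffer) % 16 == 0,
              "cbuffer size must be a multiple of 16 bytes");
static_assert(offsetof(Example_Cbuffer, window_size) == 8,
              "window_size should directly follow mouse_pos_screen");

// Returns true when a member of `size` bytes placed at byte `offset`
// stays inside a single 16-byte register, i.e. HLSL would not move it.
bool fits_in_one_register(size_t offset, size_t size) {
    assert(size > 0 && size <= 16);
    return offset / 16 == (offset + size - 1) / 16;
}
```

For instance, a `float3` (12 bytes) placed right after a `float2` would start at offset 8 and cross into the next register, so HLSL would push it to offset 16 and the C struct would need explicit padding to match. The helper does not cover arrays or nested structs, which have their own rules.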


@ -0,0 +1,68 @@
// PS_INPUT is defined in the default shader in gfx_impl_d3d11.c at the bottom of the file
// BEWARE of constant buffer packing rules (HLSL packs into 16-byte registers; similar to, but not the same as, std140):
// https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-packing-rules
cbuffer some_cbuffer : register(b0) {
float2 mouse_pos_screen; // In pixels
float2 window_size;
}
#define DETAIL_TYPE_ROUNDED_CORNERS 1
#define DETAIL_TYPE_OUTLINED 2
float4 get_light_contribution(PS_INPUT input) {
const float light_distance = 500; // We could pass this with userdata
float2 vertex_pos = input.position_screen.xy; // In pixels
vertex_pos.y = window_size.y-vertex_pos.y; // For some reason d3d11 inverts the Y here so we need to revert it
// Simple linear attenuation based on distance
float attenuation = 1.0 - (length(mouse_pos_screen - vertex_pos) / light_distance);
return float4(attenuation, attenuation, attenuation, 1.0);
}
// This procedure is the "entry" of our extension to the shader
// It basically just takes in the resulting color and input from vertex shader, for us to transform it
// however we want.
float4 pixel_shader_extension(PS_INPUT input, float4 color) {
float detail_type = input.userdata[0].x;
if (detail_type == DETAIL_TYPE_ROUNDED_CORNERS) {
float corner_radius = input.userdata[0].y;
float2 pos = input.self_uv - float2(0.5, 0.5);
float2 corner_distance = abs(pos) - (float2(0.5, 0.5) - corner_radius);
float dist = length(max(corner_distance, 0.0)) - corner_radius;
float smoothing = 0.01;
float mask = 1.0-smoothstep(0.0, smoothing, dist);
color *= mask;
} else if (detail_type == DETAIL_TYPE_OUTLINED) {
float line_width = input.userdata[0].y;
float2 pixel_pos = round(input.self_uv*window_size);
float xcenter = window_size.x/2;
float ycenter = window_size.y/2;
float xedge = pixel_pos.x < xcenter ? 0.0 : window_size.x;
float yedge = pixel_pos.y < ycenter ? 0.0 : window_size.y;
float xdist = abs(xedge-pixel_pos.x);
float ydist = abs(yedge-pixel_pos.y);
if (xdist >= line_width && ydist >= line_width) {
discard;
}
}
float4 light = get_light_contribution(input);
return color * light;
}


@ -8,7 +8,12 @@ int entry(int argc, char **argv) {
window.y = 90; window.y = 90;
window.clear_color = hex_to_rgba(0x6495EDff); window.clear_color = hex_to_rgba(0x6495EDff);
float64 last_time = os_get_current_time_in_seconds();
while (!window.should_close) { while (!window.should_close) {
float64 now = os_get_current_time_in_seconds();
if ((int)now != (int)last_time) log("%.2f FPS\n%.2fms", 1.0/(now-last_time), (now-last_time)*1000);
last_time = now;
reset_temporary_storage(); reset_temporary_storage();
float64 now = os_get_current_time_in_seconds(); float64 now = os_get_current_time_in_seconds();
@ -19,6 +24,12 @@ int entry(int argc, char **argv) {
draw_rect(v2(sin(now), -.8), v2(.5, .25), COLOR_RED); draw_rect(v2(sin(now), -.8), v2(.5, .25), COLOR_RED);
float aspect = (f32)window.width/(f32)window.height;
float mx = (input_frame.mouse_x/(f32)window.width * 2.0 - 1.0)*aspect;
float my = input_frame.mouse_y/(f32)window.height * 2.0 - 1.0;
draw_line(v2(-.75, -.75), v2(mx, my), 0.005, COLOR_WHITE);
os_update(); os_update();
gfx_update(); gfx_update();
} }
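
The mouse conversion used for `draw_line` above maps window pixels into the coordinate space the example draws in. The sketch below isolates that arithmetic. It assumes, as the example appears to, that the visible area spans [-aspect, aspect] horizontally and [-1, 1] vertically. `Vec2` and `mouse_to_draw_space` are hypothetical names.

```c
#include <assert.h>

// Hypothetical stand-in for the engine's Vector2.
typedef struct { float x, y; } Vec2;

// Maps a mouse position in window pixels to draw space, assuming the
// view covers [-aspect, aspect] in x and [-1, 1] in y.
Vec2 mouse_to_draw_space(float mouse_x, float mouse_y,
                         float window_width, float window_height) {
    assert(window_width > 0.0f && window_height > 0.0f);
    float aspect = window_width / window_height;
    Vec2 p;
    // 0..width -> -1..1, then stretch x by the aspect ratio.
    p.x = (mouse_x / window_width * 2.0f - 1.0f) * aspect;
    // 0..height -> -1..1.
    p.y = mouse_y / window_height * 2.0f - 1.0f;
    return p;
}
```

In a 1280x720 window, the pixel (640, 360) lands on the origin and the pixel (0, 0) lands on roughly (-1.78, -1).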


@ -28,7 +28,7 @@ int entry(int argc, char **argv) {
render_atlas_if_not_yet_rendered(font, 32, 'A'); render_atlas_if_not_yet_rendered(font, 32, 'A');
seed_for_random = os_get_current_cycle_count(); seed_for_random = rdtsc();
const float64 fps_limit = 69000; const float64 fps_limit = 69000;
const float64 min_frametime = 1.0 / fps_limit; const float64 min_frametime = 1.0 / fps_limit;
@ -91,8 +91,15 @@ int entry(int argc, char **argv) {
draw_frame.enable_z_sorting = do_enable_z_sorting; draw_frame.enable_z_sorting = do_enable_z_sorting;
if (is_key_just_pressed('Z')) do_enable_z_sorting = !do_enable_z_sorting; if (is_key_just_pressed('Z')) do_enable_z_sorting = !do_enable_z_sorting;
if (do_enable_z_sorting) {
push_window_scissor(
v2(input_frame.mouse_x-256, input_frame.mouse_y-256),
v2(input_frame.mouse_x+256, input_frame.mouse_y+256)
);
}
seed_for_random = 69; seed_for_random = 69;
for (u64 i = 0; i < 50000; i++) { for (u64 i = 0; i < 30000; i++) {
float32 aspect = (float32)window.width/(float32)window.height; float32 aspect = (float32)window.width/(float32)window.height;
float min_x = -aspect; float min_x = -aspect;
float max_x = aspect; float max_x = aspect;
@ -106,7 +113,7 @@ int entry(int argc, char **argv) {
draw_image(bush_image, v2(x, y), v2(0.1, 0.1), COLOR_WHITE); draw_image(bush_image, v2(x, y), v2(0.1, 0.1), COLOR_WHITE);
pop_z_layer(); pop_z_layer();
} }
seed_for_random = os_get_current_cycle_count(); seed_for_random = rdtsc();
Matrix4 hammer_xform = m4_scalar(1.0); Matrix4 hammer_xform = m4_scalar(1.0);
hammer_xform = m4_rotate_z(hammer_xform, (f32)now); hammer_xform = m4_rotate_z(hammer_xform, (f32)now);
@ -117,7 +124,7 @@ int entry(int argc, char **argv) {
Vector2 hover_position = v2_rotate_point_around_pivot(v2(-.5, -.5), v2(0, 0), (f32)now); Vector2 hover_position = v2_rotate_point_around_pivot(v2(-.5, -.5), v2(0, 0), (f32)now);
Vector2 local_pivot = v2(.125f, .125f); Vector2 local_pivot = v2(.125f, .125f);
draw_rect(v2_sub(hover_position, local_pivot), v2(.25f, .25f), v4((sin(now)+1.0)/2.0, 1.0, 0.0, 1.0)); draw_circle(v2_sub(hover_position, local_pivot), v2(.25f, .25f), v4((sin(now)+1.0)/2.0, 1.0, 0.0, 1.0));
draw_image(bush_image, v2(0.65, 0.65), v2(0.2*sin(now), 0.2*sin(now)), COLOR_WHITE); draw_image(bush_image, v2(0.65, 0.65), v2(0.2*sin(now), 0.2*sin(now)), COLOR_WHITE);
@ -134,6 +141,10 @@ int entry(int argc, char **argv) {
if (show) draw_image(atlas->image, v2(-1.6, -1), v2(4, 4), COLOR_WHITE); if (show) draw_image(atlas->image, v2(-1.6, -1), v2(4, 4), COLOR_WHITE);
if (do_enable_z_sorting) {
pop_window_scissor();
}
tm_scope("gfx_update") { tm_scope("gfx_update") {
gfx_update(); gfx_update();
} }
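
The push/pop pairing around the draw calls above follows a plain stack discipline: each quad picks up whatever scissor rectangle is on top of the stack at the moment it is drawn. Here is a minimal standalone sketch of that pattern. `Rect4`, `Scissor_Stack` and the `scissor_*` functions are hypothetical names, and the capacity is deliberately small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-in for a Vector4 holding min.x, min.y, max.x, max.y.
typedef struct { float x1, y1, x2, y2; } Rect4;

#define EXAMPLE_SCISSOR_STACK_MAX 16

typedef struct {
    Rect4  stack[EXAMPLE_SCISSOR_STACK_MAX];
    size_t count;
} Scissor_Stack;

// Pushes a new scissor rectangle; it applies to everything drawn until popped.
void scissor_push(Scissor_Stack *s, Rect4 r) {
    assert(s->count < EXAMPLE_SCISSOR_STACK_MAX);
    s->stack[s->count] = r;
    s->count += 1;
}

// Pops the most recently pushed rectangle, restoring the previous one (if any).
void scissor_pop(Scissor_Stack *s) {
    assert(s->count > 0);
    s->count -= 1;
}

// Copies the active rectangle to *out and returns true, or returns false
// when nothing is pushed (meaning: draw without scissoring).
bool scissor_current(const Scissor_Stack *s, Rect4 *out) {
    if (s->count == 0) return false;
    *out = s->stack[s->count - 1];
    return true;
}
```

A draw call would query `scissor_current` once and store the result on the quad, which mirrors how `has_scissor` and `scissor` are filled in per quad in this commit.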


@ -47,11 +47,11 @@ int entry(int argc, char **argv) {
// ... So we have to justify that bottom_left according to text metrics // ... So we have to justify that bottom_left according to text metrics
Vector2 justified = v2_sub(bottom_left, hello_metrics.functional_pos_min); Vector2 justified = v2_sub(bottom_left, hello_metrics.functional_pos_min);
// If we wanted to center it:
// justified = v2_sub(justified, v2_divf(hello_metrics.functional_size, 2));
draw_text(font, hello_str, font_height, justified, v2(1, 1), COLOR_WHITE); draw_text(font, hello_str, font_height, justified, v2(1, 1), COLOR_WHITE);
// If for example we wanted to center the text, we would do the same but then add
// the text size divided by two:
// justified = v2_add(justified, v2_divf(hello_metrics.functional_size, 2.0));
local_persist bool show_bounds = false; local_persist bool show_bounds = false;
if (is_key_just_pressed('E')) show_bounds = !show_bounds; if (is_key_just_pressed('E')) show_bounds = !show_bounds;
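
The centering recipe in the updated comment (justify to the bottom left, then subtract half the functional size) reduces to one subtraction per axis. The sketch below spells that out with stand-in types. `Vec2`, `Text_Metrics` and `centered_text_position` are hypothetical, and `Text_Metrics` only carries the two fields the recipe uses.

```c
#include <assert.h>

// Hypothetical stand-in for the engine's Vector2.
typedef struct { float x, y; } Vec2;

// Hypothetical subset of Gfx_Text_Metrics: only what centering needs.
typedef struct {
    Vec2 functional_pos_min; // offset of the functional box's bottom left
    Vec2 functional_size;    // width and height of the functional box
} Text_Metrics;

// Returns the position to pass to the text draw call so that the center
// of the functional box ends up at draw_pos.
Vec2 centered_text_position(Vec2 draw_pos, Text_Metrics m) {
    Vec2 p;
    // Step 1: justify so the box's bottom left sits on draw_pos.
    p.x = draw_pos.x - m.functional_pos_min.x;
    p.y = draw_pos.y - m.functional_pos_min.y;
    // Step 2: move back by half the box size so the center sits on draw_pos.
    p.x -= m.functional_size.x / 2.0f;
    p.y -= m.functional_size.y / 2.0f;
    return p;
}
```

Subtracting rather than adding in step 2 matters: adding half the size would push the text's bottom left past the target point instead of pulling its center onto it.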


@ -19,14 +19,43 @@ TODO:
typedef struct Gfx_Font Gfx_Font; typedef struct Gfx_Font Gfx_Font;
typedef struct Gfx_Text_Metrics { typedef struct Gfx_Text_Metrics {
// "functional" is what you would use for example for text editing for consistent
// placements. /*
// To draw text with its origin at the bottom left, you need to sub this from the bottom
// left. I.e: FUNCTIONAL BOX:
// Vector2 justified_pos = v2_sub(bottom_left, metrics.functional_pos_min); x0: left start of text box
// If you want to center, you have to first justify to bottom left and then add half of x1: right end of text box
// metrics.functional_size (or visual_size if you want perfect alignment and the text y0: The baseline for the bottom line of text
// is static). y1: The baseline for the top line of text + latin_ascent
VISUAL BOX:
x0: The minimum X of any pixels
x1: The maximum X of any pixels
y0: The minimum Y of any pixels
y1: The maximum Y of any pixels
Usage:
For a single piece of static text, it might look better to use the visual box for aligning it somewhere.
I might be stupid and that might be useless tho.
Most of the time, you are probably going to want to use the functional box.
For example, to center:
Gfx_Text_Metrics m = measure_text(...);
// This is the point where the center of the text box will be
Vector2 draw_pos = v2(...);
// First justify for the bottom-left to be at the draw point
Vector2 justified = v2_sub(draw_pos, m.functional_pos_min);
// Then move text backwards by functional_size/2 to align its center to the draw point
justified = v2_sub(justified, v2_divf(m.functional_size, 2));
*/
Vector2 functional_pos_min; Vector2 functional_pos_min;
Vector2 functional_pos_max; Vector2 functional_pos_max;
Vector2 functional_size; Vector2 functional_size;
@ -157,12 +186,12 @@ void font_variation_init(Gfx_Font_Variation *variation, Gfx_Font *font, u32 font
variation->metrics.latin_descent = c_descent; variation->metrics.latin_descent = c_descent;
} }
} }
for (u32 c = 'A'; c <= 'Z'; c++) { for (u32 c = 'A'; c <= 'Z'; c++) {
// This one is bottom-top as opposed to normally in stbtt where it's top-bottom // This one is bottom-top as opposed to normally in stbtt where it's top-bottom
int x0, y0, x1, y1; int x0, y0, x1, y1;
stbtt_GetCodepointBitmapBox(&font->stbtt_handle, (int)c, variation->scale, variation->scale, &x0, &y0, &x1, &y1); stbtt_GetCodepointBitmapBox(&font->stbtt_handle, (int)c, variation->scale, variation->scale, &x0, &y0, &x1, &y1);
float c_ascent = (float)(y1-y0); // #Bugprone #Cleanup I am not at all sure about this! float c_ascent = (float)abs(y0);
if (c_ascent > variation->metrics.latin_ascent) if (c_ascent > variation->metrics.latin_ascent)
variation->metrics.latin_ascent = c_ascent; variation->metrics.latin_ascent = c_ascent;
} }
@ -325,7 +354,8 @@ Gfx_Font_Metrics get_font_metrics_scaled(Gfx_Font *font, u32 raster_height, Vect
typedef struct { typedef struct {
Gfx_Text_Metrics m; Gfx_Text_Metrics m;
Gfx_Font *font;
u32 raster_height;
Vector2 scale; Vector2 scale;
} Measure_Text_Walk_Glyphs_Context; } Measure_Text_Walk_Glyphs_Context;
@ -333,10 +363,12 @@ bool measure_text_glyph_callback(Gfx_Glyph glyph, Gfx_Font_Atlas *atlas, float g
Measure_Text_Walk_Glyphs_Context *c = (Measure_Text_Walk_Glyphs_Context*)ud; Measure_Text_Walk_Glyphs_Context *c = (Measure_Text_Walk_Glyphs_Context*)ud;
Gfx_Font_Metrics m = get_font_metrics_scaled(c->font, c->raster_height, c->scale);
float functional_left = glyph_x-glyph.xoffset*c->scale.x; float functional_left = glyph_x-glyph.xoffset*c->scale.x;
float functional_bottom = glyph_y-glyph.yoffset*c->scale.y; float functional_bottom = glyph_y-glyph.yoffset*c->scale.y; // baseline
float functional_right = functional_left + glyph.width*c->scale.x; float functional_right = functional_left + (glyph.width+glyph.xoffset)*c->scale.x;
float functional_top = functional_bottom + glyph.height*c->scale.y; float functional_top = functional_bottom + (m.latin_ascent+glyph.yoffset)*c->scale.y;
c->m.functional_pos_min.x = min(c->m.functional_pos_min.x, functional_left); c->m.functional_pos_min.x = min(c->m.functional_pos_min.x, functional_left);
c->m.functional_pos_min.y = min(c->m.functional_pos_min.y, functional_bottom); c->m.functional_pos_min.y = min(c->m.functional_pos_min.y, functional_bottom);
@ -359,6 +391,8 @@ Gfx_Text_Metrics measure_text(Gfx_Font *font, string text, u32 raster_height, Ve
Measure_Text_Walk_Glyphs_Context c = ZERO(Measure_Text_Walk_Glyphs_Context); Measure_Text_Walk_Glyphs_Context c = ZERO(Measure_Text_Walk_Glyphs_Context);
c.scale = scale; c.scale = scale;
c.font = font;
c.raster_height = raster_height;
walk_glyphs((Walk_Glyphs_Spec){font, text, raster_height, scale, true, &c}, measure_text_glyph_callback); walk_glyphs((Walk_Glyphs_Spec){font, text, raster_height, scale, true, &c}, measure_text_glyph_callback);
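The centering recipe described in the `Gfx_Text_Metrics` comment can be sketched in isolation. This is a minimal sketch, not the engine's code: `Vec2`, `Text_Box`, `vec2_sub`, `vec2_divf` and `centered_draw_pos` are hypothetical stand-ins for `Vector2`, `Gfx_Text_Metrics`, `v2_sub` and `v2_divf`, and the numbers in the usage note are made up rather than measured from a font.

```c
// Hypothetical stand-ins for the engine's Vector2 and Gfx_Text_Metrics.
typedef struct { float x, y; } Vec2;

typedef struct {
    Vec2 functional_pos_min; // bottom-left corner of the functional box
    Vec2 functional_size;    // width and height of the functional box
} Text_Box;

static Vec2 vec2_sub(Vec2 a, Vec2 b) { Vec2 r = { a.x - b.x, a.y - b.y }; return r; }
static Vec2 vec2_divf(Vec2 a, float s) { Vec2 r = { a.x / s, a.y / s }; return r; }

// Returns the position to draw at so that the center of the functional box
// lands on `center`.
static Vec2 centered_draw_pos(Vec2 center, Text_Box m) {
    // First justify so the bottom-left of the box sits on the target point.
    Vec2 justified = vec2_sub(center, m.functional_pos_min);
    // Then move back by half the box size to align the box center instead.
    return vec2_sub(justified, vec2_divf(m.functional_size, 2.0f));
}
```

With a box whose bottom-left is at (2, -4) and whose size is 100x20, centering on (400, 300) gives a draw position of (348, 294).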

@ -1,33 +1,29 @@
#if !OOGABOOGA_DEV
#include "d3d11_image_shader_bytecode.c"
#endif
// #Cleanup apparently there are C macros for these (COBJMACROS)
#define D3D11Release(x) x->lpVtbl->Release(x) #define D3D11Release(x) x->lpVtbl->Release(x)
#define VTABLE(proc, ...) FIRST_ARG(__VA_ARGS__)->lpVtbl->proc(__VA_ARGS__)
const Gfx_Handle GFX_INVALID_HANDLE = 0; const Gfx_Handle GFX_INVALID_HANDLE = 0;
string temp_win32_null_terminated_wide_to_fixed_utf8(const u16 *utf16); string temp_win32_null_terminated_wide_to_fixed_utf8(const u16 *utf16);
// We wanna pack this at some point
// #Cleanup #Memory why am I doing alignat(16)?
typedef struct alignat(16) D3D11_Vertex { typedef struct alignat(16) D3D11_Vertex {
Vector4 color; Vector4 color;
Vector4 position; Vector4 position;
Vector2 uv; Vector2 uv;
union { Vector2 self_uv;
s32 data1;
struct {
s8 texture_index; s8 texture_index;
u8 type; u8 type;
u8 sampler; u8 sampler;
u8 padding; u8 has_scissor;
};
}; Vector4 userdata[VERTEX_2D_USER_DATA_COUNT];
Vector4 scissor;
} D3D11_Vertex; } D3D11_Vertex;
ID3D11Debug *d3d11_debug = 0; ID3D11Debug *d3d11_debug = 0;
@ -51,17 +47,23 @@ ID3D11SamplerState *d3d11_image_sampler_nl_fl = 0;
ID3D11SamplerState *d3d11_image_sampler_np_fl = 0; ID3D11SamplerState *d3d11_image_sampler_np_fl = 0;
ID3D11SamplerState *d3d11_image_sampler_nl_fp = 0; ID3D11SamplerState *d3d11_image_sampler_nl_fp = 0;
ID3D11VertexShader *d3d11_image_vertex_shader = 0; ID3D11VertexShader *d3d11_vertex_shader_for_2d = 0;
ID3D11PixelShader *d3d11_image_pixel_shader = 0; ID3D11PixelShader *d3d11_fragment_shader_for_2d = 0;
ID3D11InputLayout *d3d11_image_vertex_layout = 0; ID3D11InputLayout *d3d11_image_vertex_layout = 0;
ID3D11Buffer *d3d11_quad_vbo = 0; ID3D11Buffer *d3d11_quad_vbo = 0;
u32 d3d11_quad_vbo_size = 0; u32 d3d11_quad_vbo_size = 0;
void *d3d11_staging_quad_buffer = 0; void *d3d11_staging_quad_buffer = 0;
ID3D11Buffer *d3d11_cbuffer = 0;
u64 d3d11_cbuffer_size = 0;
Draw_Quad *sort_quad_buffer = 0; Draw_Quad *sort_quad_buffer = 0;
u64 sort_quad_buffer_size = 0; u64 sort_quad_buffer_size = 0;
// Defined at the bottom of this file
extern const char *d3d11_image_shader_source;
const char* d3d11_stringify_category(D3D11_MESSAGE_CATEGORY category) { const char* d3d11_stringify_category(D3D11_MESSAGE_CATEGORY category) {
switch (category) { switch (category) {
case D3D11_MESSAGE_CATEGORY_APPLICATION_DEFINED: return "Application Defined"; case D3D11_MESSAGE_CATEGORY_APPLICATION_DEFINED: return "Application Defined";
@ -91,6 +93,14 @@ const char* d3d11_stringify_severity(D3D11_MESSAGE_SEVERITY severity) {
void CALLBACK d3d11_debug_callback(D3D11_MESSAGE_CATEGORY category, D3D11_MESSAGE_SEVERITY severity, D3D11_MESSAGE_ID id, const char* description) void CALLBACK d3d11_debug_callback(D3D11_MESSAGE_CATEGORY category, D3D11_MESSAGE_SEVERITY severity, D3D11_MESSAGE_ID id, const char* description)
{ {
if (id == 391) {
// Sigh:
/*
[WARNING]: D3D11 MESSAGE [Category: State Creation, Severity: Warning, id: 391]: ID3D11Device::CreateInputLayout: The provided input signature expects to read an element with SemanticName/Index: 'SAMPLER_INDEX'/0 and component(s) of the type 'uint32'. However, the matching entry in the Input Layout declaration, element[5], specifies mismatched format: 'R8_SINT'. This is not an error, since behavior is well defined: The element format determines what data conversion algorithm gets applied before it shows up in a shader register. Independently, the shader input signature defines how the shader will interpret the data that has been placed in its input registers, with no change in the bits stored. It is valid for the application to reinterpret data as a different type once it is in the vertex shader, so this warning is issued just in case reinterpretation was not intended by the author.
*/
return;
}
string msg = tprint("D3D11 MESSAGE [Category: %cs, Severity: %cs, id: %d]: %cs", d3d11_stringify_category(category), d3d11_stringify_severity(severity), id, description); string msg = tprint("D3D11 MESSAGE [Category: %cs, Severity: %cs, id: %d]: %cs", d3d11_stringify_category(category), d3d11_stringify_severity(severity), id, description);
switch (severity) { switch (severity) {
@ -184,14 +194,14 @@ void d3d11_update_swapchain() {
win32_check_hr(hr); win32_check_hr(hr);
IDXGIAdapter *adapter; IDXGIAdapter *adapter;
hr = VTABLE(GetAdapter, dxgi_device, &adapter); hr = IDXGIDevice_GetAdapter(dxgi_device, &adapter);
win32_check_hr(hr); win32_check_hr(hr);
IDXGIFactory2 *dxgi_factory; IDXGIFactory2 *dxgi_factory;
hr = VTABLE(GetParent, adapter, &IID_IDXGIFactory2, cast(void**)&dxgi_factory); hr = IDXGIAdapter_GetParent(adapter, &IID_IDXGIFactory2, cast(void**)&dxgi_factory);
win32_check_hr(hr); win32_check_hr(hr);
hr = VTABLE(CreateSwapChainForHwnd, dxgi_factory, (IUnknown*)d3d11_device, window._os_handle, &scd, 0, 0, &d3d11_swap_chain); hr = IDXGIFactory2_CreateSwapChainForHwnd(dxgi_factory, (IUnknown*)d3d11_device, window._os_handle, &scd, 0, 0, &d3d11_swap_chain);
win32_check_hr(hr); win32_check_hr(hr);
RECT client_rect; RECT client_rect;
@ -202,15 +212,15 @@ void d3d11_update_swapchain() {
d3d11_swap_chain_height = client_rect.bottom-client_rect.top; d3d11_swap_chain_height = client_rect.bottom-client_rect.top;
// store the swap chain description, as created by CreateSwapChainForHwnd // store the swap chain description, as created by CreateSwapChainForHwnd
hr = VTABLE(GetDesc1, d3d11_swap_chain, &d3d11_swap_chain_desc); hr = IDXGISwapChain1_GetDesc1(d3d11_swap_chain, &d3d11_swap_chain_desc);
win32_check_hr(hr); win32_check_hr(hr);
// disable alt enter // disable alt enter
VTABLE(MakeWindowAssociation, dxgi_factory, window._os_handle, cast (u32) DXGI_MWA_NO_ALT_ENTER); IDXGIFactory_MakeWindowAssociation(dxgi_factory, window._os_handle, cast (u32) DXGI_MWA_NO_ALT_ENTER);
D3D11Release(dxgi_device); IDXGIDevice_Release(dxgi_device);
D3D11Release(adapter); IDXGIAdapter_Release(adapter);
D3D11Release(dxgi_factory); IDXGIFactory_Release(dxgi_factory);
log("Created swap chain of size %dx%d", d3d11_swap_chain_width, d3d11_swap_chain_height); log("Created swap chain of size %dx%d", d3d11_swap_chain_width, d3d11_swap_chain_height);
} else { } else {
@ -224,11 +234,11 @@ void d3d11_update_swapchain() {
u32 window_width = client_rect.right-client_rect.left; u32 window_width = client_rect.right-client_rect.left;
u32 window_height = client_rect.bottom-client_rect.top; u32 window_height = client_rect.bottom-client_rect.top;
hr = VTABLE(ResizeBuffers, d3d11_swap_chain, d3d11_swap_chain_desc.BufferCount, window_width, window_height, d3d11_swap_chain_desc.Format, d3d11_swap_chain_desc.Flags); hr = IDXGISwapChain1_ResizeBuffers(d3d11_swap_chain, d3d11_swap_chain_desc.BufferCount, window_width, window_height, d3d11_swap_chain_desc.Format, d3d11_swap_chain_desc.Flags);
win32_check_hr(hr); win32_check_hr(hr);
// update swap chain description // update swap chain description
hr = VTABLE(GetDesc1, d3d11_swap_chain, &d3d11_swap_chain_desc); hr = IDXGISwapChain1_GetDesc1(d3d11_swap_chain, &d3d11_swap_chain_desc);
win32_check_hr(hr); win32_check_hr(hr);
log("Resized swap chain from %dx%d to %dx%d", d3d11_swap_chain_width, d3d11_swap_chain_height, window_width, window_height); log("Resized swap chain from %dx%d to %dx%d", d3d11_swap_chain_width, d3d11_swap_chain_height, window_width, window_height);
@ -240,12 +250,156 @@ void d3d11_update_swapchain() {
hr = VTABLE(GetBuffer, d3d11_swap_chain, 0, &IID_ID3D11Texture2D, (void**)&d3d11_back_buffer); hr = IDXGISwapChain1_GetBuffer(d3d11_swap_chain, 0, &IID_ID3D11Texture2D, (void**)&d3d11_back_buffer);
win32_check_hr(hr); win32_check_hr(hr);
hr = VTABLE(CreateRenderTargetView, d3d11_device, (ID3D11Resource*)d3d11_back_buffer, 0, &d3d11_window_render_target_view); hr = ID3D11Device_CreateRenderTargetView(d3d11_device, (ID3D11Resource*)d3d11_back_buffer, 0, &d3d11_window_render_target_view);
win32_check_hr(hr); win32_check_hr(hr);
} }
bool
d3d11_compile_shader(string source) {
source = string_replace_all(source, STR("$INJECT_PIXEL_POST_PROCESS"), STR("float4 pixel_shader_extension(PS_INPUT input, float4 color) { return color; }"), temp);
source = string_replace_all(source, STR("$VERTEX_2D_USER_DATA_COUNT"), tprint("%d", VERTEX_2D_USER_DATA_COUNT), temp);
// #Leak on recompile
///
// Make default shaders
// Compile vertex shader
ID3DBlob* vs_blob = NULL;
ID3DBlob* err_blob = NULL;
HRESULT hr = D3DCompile((char*)source.data, source.count, 0, 0, 0, "vs_main", "vs_5_0", 0, 0, &vs_blob, &err_blob);
if (!SUCCEEDED(hr)) {
log_error("Vertex Shader Compilation Error: %cs\n", (char*)ID3D10Blob_GetBufferPointer(err_blob));
return false;
}
// Compile pixel shader
ID3DBlob* ps_blob = NULL;
hr = D3DCompile((char*)source.data, source.count, 0, 0, 0, "ps_main", "ps_5_0", 0, 0, &ps_blob, &err_blob);
if (!SUCCEEDED(hr)) {
log_error("Fragment Shader Compilation Error: %cs\n", (char*)ID3D10Blob_GetBufferPointer(err_blob));
return false;
}
void *vs_buffer = ID3D10Blob_GetBufferPointer(vs_blob);
u64 vs_size = ID3D10Blob_GetBufferSize(vs_blob);
void *ps_buffer = ID3D10Blob_GetBufferPointer(ps_blob);
u64 ps_size = ID3D10Blob_GetBufferSize(ps_blob);
log_verbose("Shaders compiled");
// Create the shaders
hr = ID3D11Device_CreateVertexShader(d3d11_device, vs_buffer, vs_size, NULL, &d3d11_vertex_shader_for_2d);
win32_check_hr(hr);
hr = ID3D11Device_CreatePixelShader(d3d11_device, ps_buffer, ps_size, NULL, &d3d11_fragment_shader_for_2d);
win32_check_hr(hr);
log_verbose("Shaders created");
#define layout_base_count 9
D3D11_INPUT_ELEMENT_DESC layout[layout_base_count+VERTEX_2D_USER_DATA_COUNT];
memset(layout, 0, sizeof(layout));
layout[0].SemanticName = "POSITION";
layout[0].SemanticIndex = 0;
layout[0].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
layout[0].InputSlot = 0;
layout[0].AlignedByteOffset = offsetof(D3D11_Vertex, position);
layout[0].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[0].InstanceDataStepRate = 0;
layout[1].SemanticName = "TEXCOORD";
layout[1].SemanticIndex = 0;
layout[1].Format = DXGI_FORMAT_R32G32_FLOAT;
layout[1].InputSlot = 0;
layout[1].AlignedByteOffset = offsetof(D3D11_Vertex, uv);
layout[1].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[1].InstanceDataStepRate = 0;
layout[2].SemanticName = "COLOR";
layout[2].SemanticIndex = 0;
layout[2].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
layout[2].InputSlot = 0;
layout[2].AlignedByteOffset = offsetof(D3D11_Vertex, color);
layout[2].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[2].InstanceDataStepRate = 0;
layout[3].SemanticName = "TEXTURE_INDEX";
layout[3].SemanticIndex = 0;
layout[3].Format = DXGI_FORMAT_R8_SINT;
layout[3].InputSlot = 0;
layout[3].AlignedByteOffset = offsetof(D3D11_Vertex, texture_index);
layout[3].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[3].InstanceDataStepRate = 0;
layout[4].SemanticName = "TYPE";
layout[4].SemanticIndex = 0;
layout[4].Format = DXGI_FORMAT_R8_UINT;
layout[4].InputSlot = 0;
layout[4].AlignedByteOffset = offsetof(D3D11_Vertex, type);
layout[4].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[4].InstanceDataStepRate = 0;
layout[5].SemanticName = "SAMPLER_INDEX";
layout[5].SemanticIndex = 0;
layout[5].Format = DXGI_FORMAT_R8_SINT;
layout[5].InputSlot = 0;
layout[5].AlignedByteOffset = offsetof(D3D11_Vertex, sampler);
layout[5].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[5].InstanceDataStepRate = 0;
layout[6].SemanticName = "SELF_UV";
layout[6].SemanticIndex = 0;
layout[6].Format = DXGI_FORMAT_R32G32_FLOAT;
layout[6].InputSlot = 0;
layout[6].AlignedByteOffset = offsetof(D3D11_Vertex, self_uv);
layout[6].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[6].InstanceDataStepRate = 0;
layout[7].SemanticName = "SCISSOR";
layout[7].SemanticIndex = 0;
layout[7].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
layout[7].InputSlot = 0;
layout[7].AlignedByteOffset = offsetof(D3D11_Vertex, scissor);
layout[7].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[7].InstanceDataStepRate = 0;
layout[8].SemanticName = "HAS_SCISSOR";
layout[8].SemanticIndex = 0;
layout[8].Format = DXGI_FORMAT_R8_UINT;
layout[8].InputSlot = 0;
layout[8].AlignedByteOffset = offsetof(D3D11_Vertex, has_scissor);
layout[8].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[8].InstanceDataStepRate = 0;
for (int i = 0; i < VERTEX_2D_USER_DATA_COUNT; ++i) {
layout[layout_base_count + i].SemanticName = "USERDATA";
layout[layout_base_count + i].SemanticIndex = i;
layout[layout_base_count + i].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
layout[layout_base_count + i].InputSlot = 0;
layout[layout_base_count + i].AlignedByteOffset = offsetof(D3D11_Vertex, userdata) + sizeof(Vector4) * i;
layout[layout_base_count + i].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
}
hr = ID3D11Device_CreateInputLayout(d3d11_device, layout, layout_base_count+VERTEX_2D_USER_DATA_COUNT, vs_buffer, vs_size, &d3d11_image_vertex_layout);
win32_check_hr(hr);
#undef layout_base_count
D3D11Release(vs_blob);
D3D11Release(ps_blob);
return true;
}
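The input layout above takes each `AlignedByteOffset` from `offsetof` instead of hard-coding byte counts, so the layout follows the struct when fields are added or reordered. A minimal, D3D-free sketch of that idea follows; `Demo_Vertex`, `Demo_Element` and `demo_layout` are hypothetical names for illustration and do not mirror the real `D3D11_Vertex` field for field.

```c
#include <stddef.h> // offsetof, size_t

// Hypothetical vertex: declaration order differs from semantic order on
// purpose, to show that offsetof keeps the table correct regardless.
typedef struct {
    float color[4];
    float position[4];
    float uv[2];
    unsigned char type;
} Demo_Vertex;

// Hypothetical, reduced stand-in for an input element description.
typedef struct {
    const char *semantic;
    size_t byte_offset;
} Demo_Element;

static const Demo_Element demo_layout[] = {
    { "POSITION", offsetof(Demo_Vertex, position) },
    { "TEXCOORD", offsetof(Demo_Vertex, uv) },
    { "COLOR",    offsetof(Demo_Vertex, color) },
    { "TYPE",     offsetof(Demo_Vertex, type) },
};
```

Assuming 4-byte floats, the offsets come out as 16, 32, 0 and 40 bytes; moving a field in `Demo_Vertex` changes them without touching the table.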
void gfx_init() { void gfx_init() {
window.enable_vsync = false; window.enable_vsync = false;
@ -304,13 +458,13 @@ void gfx_init() {
win32_check_hr(hr); win32_check_hr(hr);
if (debug_failed) { if (debug_failed) {
log_error("We could not init D3D11 with DEBUG flag. This is likely because you have not enabled \"Graphics Tools\" in windows settings. https://github.com/microsoft/DirectX-Graphics-Samples/issues/447#issuecomment-415611443"); log_error("We could not init D3D11 with DEBUG flag. To fix this, you can try:\n1. Go to windows settings\n2. Go to System -> Optional features\n3. Add the feature called \"Graphics Tools\"\n4. Restart your computer\n5. Be frustrated that windows is like this.\nhttps://devblogs.microsoft.com/cppblog/visual-studio-2015-and-graphics-tools-for-windows-10/");
} }
assert(d3d11_device != 0, "D3D11CreateDevice failed"); assert(d3d11_device != 0, "D3D11CreateDevice failed");
#if CONFIGURATION == DEBUG #if CONFIGURATION == DEBUG
hr = VTABLE(QueryInterface, d3d11_device, &IID_ID3D11Debug, (void**)&d3d11_debug); hr = ID3D11Device_QueryInterface(d3d11_device, &IID_ID3D11Debug, (void**)&d3d11_debug);
if (SUCCEEDED(hr)) { if (SUCCEEDED(hr)) {
log_verbose("D3D11 debug is active"); log_verbose("D3D11 debug is active");
} }
@ -320,13 +474,13 @@ void gfx_init() {
IDXGIDevice *dxgi_device = 0; IDXGIDevice *dxgi_device = 0;
IDXGIAdapter *target_adapter = 0; IDXGIAdapter *target_adapter = 0;
hr = VTABLE(QueryInterface, d3d11_device, &IID_IDXGIDevice, (void **)&dxgi_device); hr = ID3D11Device_QueryInterface(d3d11_device, &IID_IDXGIDevice, (void **)&dxgi_device);
hr = VTABLE(GetAdapter, dxgi_device, &target_adapter); hr = IDXGIDevice_GetAdapter(dxgi_device, &target_adapter);
if (SUCCEEDED(hr)) { if (SUCCEEDED(hr)) {
DXGI_ADAPTER_DESC adapter_desc = ZERO(DXGI_ADAPTER_DESC); DXGI_ADAPTER_DESC adapter_desc = ZERO(DXGI_ADAPTER_DESC);
hr = VTABLE(GetDesc, target_adapter, &adapter_desc); hr = IDXGIAdapter_GetDesc(target_adapter, &adapter_desc);
if (SUCCEEDED(hr)) { if (SUCCEEDED(hr)) {
string desc = temp_win32_null_terminated_wide_to_fixed_utf8(adapter_desc.Description); string desc = temp_win32_null_terminated_wide_to_fixed_utf8(adapter_desc.Description);
log("D3D11 adapter is: %s", desc); log("D3D11 adapter is: %s", desc);
@ -349,9 +503,9 @@ void gfx_init() {
bd.RenderTarget[0].DestBlendAlpha = D3D11_BLEND_ZERO; bd.RenderTarget[0].DestBlendAlpha = D3D11_BLEND_ZERO;
bd.RenderTarget[0].BlendOpAlpha = D3D11_BLEND_OP_ADD; bd.RenderTarget[0].BlendOpAlpha = D3D11_BLEND_OP_ADD;
bd.RenderTarget[0].RenderTargetWriteMask = D3D11_COLOR_WRITE_ENABLE_ALL; bd.RenderTarget[0].RenderTargetWriteMask = D3D11_COLOR_WRITE_ENABLE_ALL;
hr = VTABLE(CreateBlendState, d3d11_device, &bd, &d3d11_blend_state); hr = ID3D11Device_CreateBlendState(d3d11_device, &bd, &d3d11_blend_state);
win32_check_hr(hr); win32_check_hr(hr);
VTABLE(OMSetBlendState, d3d11_context, d3d11_blend_state, NULL, 0xffffffff); ID3D11DeviceContext_OMSetBlendState(d3d11_context, d3d11_blend_state, NULL, 0xffffffff);
} }
{ {
@ -361,30 +515,11 @@ void gfx_init() {
desc.FrontCounterClockwise = FALSE; desc.FrontCounterClockwise = FALSE;
desc.DepthClipEnable = FALSE; desc.DepthClipEnable = FALSE;
desc.CullMode = D3D11_CULL_NONE; desc.CullMode = D3D11_CULL_NONE;
hr = VTABLE(CreateRasterizerState, d3d11_device, &desc, &d3d11_rasterizer); hr = ID3D11Device_CreateRasterizerState(d3d11_device, &desc, &d3d11_rasterizer);
win32_check_hr(hr); win32_check_hr(hr);
VTABLE(RSSetState, d3d11_context, d3d11_rasterizer); ID3D11DeviceContext_RSSetState(d3d11_context, d3d11_rasterizer);
} }
// Const buffer
/*{
D3D11_BUFFER_DESC bd;
bd.ByteWidth = align_forward(sizeof(GlobalConstBuffer), 16);
bd.Usage = D3D11_USAGE_DYNAMIC;
bd.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
bd.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
ID3D11Device_CreateBuffer(dx_state.d3d_device, &bd, NULL, &dx_state.const_buffer_resource);
}*/
/*{
D3D11_BUFFER_DESC bd;
bd.ByteWidth = align_forward(sizeof(BatchUniforms), 16);
bd.Usage = D3D11_USAGE_DYNAMIC;
bd.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
bd.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
ID3D11Device_CreateBuffer(dx_state.d3d_device, &bd, NULL, &render_st.batch.ubo);
}*/
{ {
D3D11_SAMPLER_DESC sd = ZERO(D3D11_SAMPLER_DESC); D3D11_SAMPLER_DESC sd = ZERO(D3D11_SAMPLER_DESC);
sd.Filter = D3D11_FILTER_MIN_MAG_MIP_POINT; sd.Filter = D3D11_FILTER_MIN_MAG_MIP_POINT;
@ -394,175 +529,82 @@ void gfx_init() {
sd.ComparisonFunc = D3D11_COMPARISON_NEVER; sd.ComparisonFunc = D3D11_COMPARISON_NEVER;
sd.Filter = D3D11_FILTER_MIN_MAG_MIP_POINT; sd.Filter = D3D11_FILTER_MIN_MAG_MIP_POINT;
hr = VTABLE(CreateSamplerState, d3d11_device, &sd, &d3d11_image_sampler_np_fp); hr = ID3D11Device_CreateSamplerState(d3d11_device, &sd, &d3d11_image_sampler_np_fp);
win32_check_hr(hr); win32_check_hr(hr);
sd.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR; sd.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR;
hr = VTABLE(CreateSamplerState, d3d11_device, &sd, &d3d11_image_sampler_nl_fl); hr = ID3D11Device_CreateSamplerState(d3d11_device, &sd, &d3d11_image_sampler_nl_fl);
win32_check_hr(hr); win32_check_hr(hr);
sd.Filter = D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT; sd.Filter = D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT;
hr = VTABLE(CreateSamplerState, d3d11_device, &sd, &d3d11_image_sampler_np_fl); hr = ID3D11Device_CreateSamplerState(d3d11_device, &sd, &d3d11_image_sampler_np_fl);
win32_check_hr(hr); win32_check_hr(hr);
sd.Filter = D3D11_FILTER_MIN_POINT_MAG_MIP_LINEAR; sd.Filter = D3D11_FILTER_MIN_POINT_MAG_MIP_LINEAR;
hr = VTABLE(CreateSamplerState, d3d11_device, &sd, &d3d11_image_sampler_nl_fp); hr = ID3D11Device_CreateSamplerState(d3d11_device, &sd, &d3d11_image_sampler_nl_fp);
win32_check_hr(hr); win32_check_hr(hr);
} }
// We are ooga booga devs so we read the file and compile string source = STR(d3d11_image_shader_source);
#if OOGABOOGA_DEV
string source; bool ok = d3d11_compile_shader(source);
bool source_ok = os_read_entire_file("oogabooga/dev/d3d11_image_shader.hlsl", &source, get_heap_allocator()); // #Leak
assert(source_ok, "Could not open d3d11_image_shader source");
// Compile vertex shader assert(ok, "Failed compiling default shader");
ID3DBlob* vs_blob = NULL;
ID3DBlob* err_blob = NULL;
hr = D3DCompile((char*)source.data, source.count, 0, 0, 0, "vs_main", "vs_5_0", 0, 0, &vs_blob, &err_blob);
assert(SUCCEEDED(hr), "Vertex Shader Compilation Error: %cs\n", (char*)VTABLE(GetBufferPointer, err_blob));
// Compile pixel shader
ID3DBlob* ps_blob = NULL;
hr = D3DCompile((char*)source.data, source.count, 0, 0, 0, "ps_main", "ps_5_0", 0, 0, &ps_blob, &err_blob);
assert(SUCCEEDED(hr), "Vertex Shader Compilation Error: %cs\n", (char*)VTABLE(GetBufferPointer, err_blob));
void *vs_buffer = VTABLE(GetBufferPointer, vs_blob);
u64 vs_size = VTABLE(GetBufferSize, vs_blob);
void *ps_buffer = VTABLE(GetBufferPointer, ps_blob);
u64 ps_size = VTABLE(GetBufferSize, ps_blob);
log_verbose("Shaders compiled");
///
// Dump blobs to the .c
File blob_file = os_file_open("oogabooga/d3d11_image_shader_bytecode.c", O_WRITE | O_CREATE);
os_file_write_string(blob_file, STR("/*\n"));
os_file_write_string(blob_file, STR("<<<<<< Bytecode compiled from HLSL code below: >>>>>>\n\n"));
os_file_write_string(blob_file, source);
os_file_write_string(blob_file, STR("\n*/\n\n"));
os_file_write_string(blob_file, STR("const u8 IMAGE_SHADER_VERTEX_BLOB_BYTES[]= {\n"));
for (u64 i = 0; i < vs_size; i++) {
os_file_write_string(blob_file, tprint("0x%02x", (int)((u8*)vs_buffer)[i]));
if (i < vs_size-1) os_file_write_string(blob_file, STR(", "));
if (i % 15 == 0 && i != 0) os_file_write_string(blob_file, STR("\n"));
}
os_file_write_string(blob_file, STR("\n};\n"));
os_file_write_string(blob_file, STR("const u8 IMAGE_SHADER_PIXEL_BLOB_BYTES[]= {\n"));
for (u64 i = 0; i < ps_size; i++) {
os_file_write_string(blob_file, tprint("0x%02x", (int)((u8*)ps_buffer)[i]));
if (i < ps_size-1) os_file_write_string(blob_file, STR(", "));
if (i % 15 == 0 && i != 0) os_file_write_string(blob_file, STR("\n"));
}
os_file_write_string(blob_file, STR("\n};\n"));
os_file_close(blob_file);
#else
const void *vs_buffer = IMAGE_SHADER_VERTEX_BLOB_BYTES;
u64 vs_size = sizeof(IMAGE_SHADER_VERTEX_BLOB_BYTES);
const void *ps_buffer = IMAGE_SHADER_PIXEL_BLOB_BYTES;
u64 ps_size = sizeof(IMAGE_SHADER_PIXEL_BLOB_BYTES);
log_verbose("Cached shaders loaded");
#endif
// Create the shaders
hr = VTABLE(CreateVertexShader, d3d11_device, vs_buffer, vs_size, NULL, &d3d11_image_vertex_shader);
win32_check_hr(hr);
hr = VTABLE(CreatePixelShader, d3d11_device, ps_buffer, ps_size, NULL, &d3d11_image_pixel_shader);
win32_check_hr(hr);
log_verbose("Shaders created");
D3D11_INPUT_ELEMENT_DESC layout[4];
memset(layout, 0, sizeof(layout));
layout[0].SemanticName = "POSITION";
layout[0].SemanticIndex = 0;
layout[0].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
layout[0].InputSlot = 0;
layout[0].AlignedByteOffset = offsetof(D3D11_Vertex, position);
layout[0].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[0].InstanceDataStepRate = 0;
layout[1].SemanticName = "TEXCOORD";
layout[1].SemanticIndex = 0;
layout[1].Format = DXGI_FORMAT_R32G32_FLOAT;
layout[1].InputSlot = 0;
layout[1].AlignedByteOffset = offsetof(D3D11_Vertex, uv);
layout[1].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[1].InstanceDataStepRate = 0;
layout[2].SemanticName = "COLOR";
layout[2].SemanticIndex = 0;
layout[2].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
layout[2].InputSlot = 0;
layout[2].AlignedByteOffset = offsetof(D3D11_Vertex, color);
layout[2].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[2].InstanceDataStepRate = 0;
layout[3].SemanticName = "DATA1_";
layout[3].SemanticIndex = 0;
layout[3].Format = DXGI_FORMAT_R32_SINT;
layout[3].InputSlot = 0;
layout[3].AlignedByteOffset = offsetof(D3D11_Vertex, data1);
layout[3].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
layout[3].InstanceDataStepRate = 0;
hr = VTABLE(CreateInputLayout, d3d11_device, layout, 4, vs_buffer, vs_size, &d3d11_image_vertex_layout);
win32_check_hr(hr);
#if OOGABOOGA_DEV
D3D11Release(vs_blob);
D3D11Release(ps_blob);
#endif
log_info("D3D11 init done"); log_info("D3D11 init done");
} }
void d3d11_draw_call(int number_of_rendered_quads, ID3D11ShaderResourceView **textures, u64 num_textures) { void d3d11_draw_call(int number_of_rendered_quads, ID3D11ShaderResourceView **textures, u64 num_textures) {
VTABLE(OMSetBlendState, d3d11_context, d3d11_blend_state, 0, 0xffffffff); ID3D11DeviceContext_OMSetBlendState(d3d11_context, d3d11_blend_state, 0, 0xffffffff);
VTABLE(OMSetRenderTargets, d3d11_context, 1, &d3d11_window_render_target_view, 0); ID3D11DeviceContext_OMSetRenderTargets(d3d11_context, 1, &d3d11_window_render_target_view, 0);
VTABLE(RSSetState, d3d11_context, d3d11_rasterizer); ID3D11DeviceContext_RSSetState(d3d11_context, d3d11_rasterizer);
D3D11_VIEWPORT viewport = ZERO(D3D11_VIEWPORT); D3D11_VIEWPORT viewport = ZERO(D3D11_VIEWPORT);
viewport.Width = d3d11_swap_chain_width; viewport.Width = d3d11_swap_chain_width;
viewport.Height = d3d11_swap_chain_height; viewport.Height = d3d11_swap_chain_height;
viewport.MaxDepth = 1.0; viewport.MaxDepth = 1.0;
VTABLE(RSSetViewports, d3d11_context, 1, &viewport); ID3D11DeviceContext_RSSetViewports(d3d11_context, 1, &viewport);
UINT stride = sizeof(D3D11_Vertex); UINT stride = sizeof(D3D11_Vertex);
UINT offset = 0; UINT offset = 0;
VTABLE(IASetInputLayout, d3d11_context, d3d11_image_vertex_layout); ID3D11DeviceContext_IASetInputLayout(d3d11_context, d3d11_image_vertex_layout);
VTABLE(IASetVertexBuffers, d3d11_context, 0, 1, &d3d11_quad_vbo, &stride, &offset); ID3D11DeviceContext_IASetVertexBuffers(d3d11_context, 0, 1, &d3d11_quad_vbo, &stride, &offset);
VTABLE(IASetPrimitiveTopology, d3d11_context, D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST); ID3D11DeviceContext_IASetPrimitiveTopology(d3d11_context, D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
VTABLE(VSSetShader, d3d11_context, d3d11_image_vertex_shader, NULL, 0); ID3D11DeviceContext_VSSetShader(d3d11_context, d3d11_vertex_shader_for_2d, NULL, 0);
VTABLE(PSSetShader, d3d11_context, d3d11_image_pixel_shader, NULL, 0); ID3D11DeviceContext_PSSetShader(d3d11_context, d3d11_fragment_shader_for_2d, NULL, 0);
VTABLE(PSSetSamplers, d3d11_context, 0, 1, &d3d11_image_sampler_np_fp); if (draw_frame.cbuffer && d3d11_cbuffer && d3d11_cbuffer_size) {
VTABLE(PSSetSamplers, d3d11_context, 1, 1, &d3d11_image_sampler_nl_fl); D3D11_MAPPED_SUBRESOURCE cbuffer_mapping;
VTABLE(PSSetSamplers, d3d11_context, 2, 1, &d3d11_image_sampler_np_fl); ID3D11DeviceContext_Map(
VTABLE(PSSetSamplers, d3d11_context, 3, 1, &d3d11_image_sampler_nl_fp); d3d11_context,
VTABLE(PSSetShaderResources, d3d11_context, 0, num_textures, textures); (ID3D11Resource*)d3d11_cbuffer,
0,
D3D11_MAP_WRITE_DISCARD,
0,
&cbuffer_mapping
);
memcpy(cbuffer_mapping.pData, draw_frame.cbuffer, d3d11_cbuffer_size);
ID3D11DeviceContext_Unmap(d3d11_context, (ID3D11Resource*)d3d11_cbuffer, 0);
VTABLE(Draw, d3d11_context, number_of_rendered_quads * 6, 0); ID3D11DeviceContext_PSSetConstantBuffers(d3d11_context, 0, 1, &d3d11_cbuffer);
}
ID3D11DeviceContext_PSSetSamplers(d3d11_context, 0, 1, &d3d11_image_sampler_np_fp);
ID3D11DeviceContext_PSSetSamplers(d3d11_context, 1, 1, &d3d11_image_sampler_nl_fl);
ID3D11DeviceContext_PSSetSamplers(d3d11_context, 2, 1, &d3d11_image_sampler_np_fl);
ID3D11DeviceContext_PSSetSamplers(d3d11_context, 3, 1, &d3d11_image_sampler_nl_fp);
ID3D11DeviceContext_PSSetShaderResources(d3d11_context, 0, num_textures, textures);
ID3D11DeviceContext_Draw(d3d11_context, number_of_rendered_quads * 6, 0);
} }
void d3d11_process_draw_frame() { void d3d11_process_draw_frame() {
HRESULT hr; HRESULT hr;
VTABLE(ClearRenderTargetView, d3d11_context, d3d11_window_render_target_view, (float*)&window.clear_color); ID3D11DeviceContext_ClearRenderTargetView(d3d11_context, d3d11_window_render_target_view, (float*)&window.clear_color);
/// ///
// Maybe grow quad vbo // Maybe grow quad vbo
@ -578,7 +620,7 @@ void d3d11_process_draw_frame() {
desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE; desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
desc.ByteWidth = required_size; desc.ByteWidth = required_size;
desc.BindFlags = D3D11_BIND_VERTEX_BUFFER; desc.BindFlags = D3D11_BIND_VERTEX_BUFFER;
HRESULT hr = VTABLE(CreateBuffer, d3d11_device, &desc, 0, &d3d11_quad_vbo); HRESULT hr = ID3D11Device_CreateBuffer(d3d11_device, &desc, 0, &d3d11_quad_vbo);
assert(SUCCEEDED(hr), "CreateBuffer failed"); assert(SUCCEEDED(hr), "CreateBuffer failed");
d3d11_quad_vbo_size = required_size; d3d11_quad_vbo_size = required_size;
@ -639,9 +681,9 @@ void d3d11_process_draw_frame() {
if (num_textures >= 32) { if (num_textures >= 32) {
// If max textures reached, make a draw call and start over // If max textures reached, make a draw call and start over
D3D11_MAPPED_SUBRESOURCE buffer_mapping; D3D11_MAPPED_SUBRESOURCE buffer_mapping;
VTABLE(Map, d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0, D3D11_MAP_WRITE_DISCARD, 0, &buffer_mapping); ID3D11DeviceContext_Map(d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0, D3D11_MAP_WRITE_DISCARD, 0, &buffer_mapping);
memcpy(buffer_mapping.pData, d3d11_staging_quad_buffer, number_of_rendered_quads*sizeof(D3D11_Vertex)*6); memcpy(buffer_mapping.pData, d3d11_staging_quad_buffer, number_of_rendered_quads*sizeof(D3D11_Vertex)*6);
VTABLE(Unmap, d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0); ID3D11DeviceContext_Unmap(d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0);
d3d11_draw_call(number_of_rendered_quads, textures, num_textures); d3d11_draw_call(number_of_rendered_quads, textures, num_textures);
head = (D3D11_Vertex*)d3d11_staging_quad_buffer; head = (D3D11_Vertex*)d3d11_staging_quad_buffer;
num_textures = 0; num_textures = 0;
@ -694,11 +736,32 @@ void d3d11_process_draw_frame() {
TR->uv = v2(q->uv.x2, q->uv.y2); TR->uv = v2(q->uv.x2, q->uv.y2);
BR->uv = v2(q->uv.x2, q->uv.y1); BR->uv = v2(q->uv.x2, q->uv.y1);
BL->self_uv = v2(0, 0);
TL->self_uv = v2(0, 1);
TR->self_uv = v2(1, 1);
BR->self_uv = v2(1, 0);
// #Speed
memcpy(BL->userdata, q->userdata, sizeof(q->userdata));
memcpy(TL->userdata, q->userdata, sizeof(q->userdata));
memcpy(TR->userdata, q->userdata, sizeof(q->userdata));
memcpy(BR->userdata, q->userdata, sizeof(q->userdata));
BL->color = TL->color = TR->color = BR->color = q->color; BL->color = TL->color = TR->color = BR->color = q->color;
BL->texture_index=TL->texture_index=TR->texture_index=BR->texture_index = texture_index; BL->texture_index=TL->texture_index=TR->texture_index=BR->texture_index = texture_index;
BL->type=TL->type=TR->type=BR->type = (u8)q->type; BL->type=TL->type=TR->type=BR->type = (u8)q->type;
float t = q->scissor.y1;
q->scissor.y1 = q->scissor.y2;
q->scissor.y2 = t;
q->scissor.y1 = window.pixel_height - q->scissor.y1;
q->scissor.y2 = window.pixel_height - q->scissor.y2;
BL->has_scissor=TL->has_scissor=TR->has_scissor=BR->has_scissor = q->has_scissor;
BL->scissor=TL->scissor=TR->scissor=BR->scissor = q->scissor;
u8 sampler = -1; u8 sampler = -1;
if (q->image_min_filter == GFX_FILTER_MODE_NEAREST if (q->image_min_filter == GFX_FILTER_MODE_NEAREST
&& q->image_mag_filter == GFX_FILTER_MODE_NEAREST) && q->image_mag_filter == GFX_FILTER_MODE_NEAREST)
@ -727,14 +790,14 @@ void d3d11_process_draw_frame() {
tm_scope("Write to gpu") { tm_scope("Write to gpu") {
D3D11_MAPPED_SUBRESOURCE buffer_mapping; D3D11_MAPPED_SUBRESOURCE buffer_mapping;
tm_scope("The Map call") { tm_scope("The Map call") {
hr = VTABLE(Map, d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0, D3D11_MAP_WRITE_DISCARD, 0, &buffer_mapping); hr = ID3D11DeviceContext_Map(d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0, D3D11_MAP_WRITE_DISCARD, 0, &buffer_mapping);
win32_check_hr(hr); win32_check_hr(hr);
} }
tm_scope("The memcpy") { tm_scope("The memcpy") {
memcpy(buffer_mapping.pData, d3d11_staging_quad_buffer, number_of_rendered_quads*sizeof(D3D11_Vertex)*6); memcpy(buffer_mapping.pData, d3d11_staging_quad_buffer, number_of_rendered_quads*sizeof(D3D11_Vertex)*6);
} }
tm_scope("The Unmap call") { tm_scope("The Unmap call") {
VTABLE(Unmap, d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0); ID3D11DeviceContext_Unmap(d3d11_context, (ID3D11Resource*)d3d11_quad_vbo, 0);
} }
} }
@ -766,7 +829,7 @@ void gfx_update() {
d3d11_process_draw_frame(); d3d11_process_draw_frame();
tm_scope("Present") { tm_scope("Present") {
VTABLE(Present, d3d11_swap_chain, window.enable_vsync, window.enable_vsync ? 0 : DXGI_PRESENT_ALLOW_TEARING); IDXGISwapChain1_Present(d3d11_swap_chain, window.enable_vsync, window.enable_vsync ? 0 : DXGI_PRESENT_ALLOW_TEARING);
} }
@ -774,16 +837,16 @@ void gfx_update() {
/// ///
// Check debug messages, output to stdout // Check debug messages, output to stdout
ID3D11InfoQueue* info_q = 0; ID3D11InfoQueue* info_q = 0;
hr = VTABLE(QueryInterface, d3d11_device, &IID_ID3D11InfoQueue, (void**)&info_q); hr = ID3D11Device_QueryInterface(d3d11_device, &IID_ID3D11InfoQueue, (void**)&info_q);
if (SUCCEEDED(hr)) { if (SUCCEEDED(hr)) {
u64 msg_count = VTABLE(GetNumStoredMessagesAllowedByRetrievalFilter, info_q); u64 msg_count = ID3D11InfoQueue_GetNumStoredMessagesAllowedByRetrievalFilter(info_q);
for (u64 i = 0; i < msg_count; i++) { for (u64 i = 0; i < msg_count; i++) {
SIZE_T msg_size = 0; SIZE_T msg_size = 0;
VTABLE(GetMessage, info_q, i, 0, &msg_size); ID3D11InfoQueue_GetMessage(info_q, i, 0, &msg_size);
D3D11_MESSAGE* msg = (D3D11_MESSAGE*)talloc(msg_size); D3D11_MESSAGE* msg = (D3D11_MESSAGE*)talloc(msg_size);
if (msg) { if (msg) {
VTABLE(GetMessage, info_q, i, msg, &msg_size); // Get the actual message ID3D11InfoQueue_GetMessage(info_q, i, msg, &msg_size); // Get the actual message
d3d11_debug_callback(msg->Category, msg->Severity, msg->ID, msg->pDescription); d3d11_debug_callback(msg->Category, msg->Severity, msg->ID, msg->pDescription);
} }
@ -828,10 +891,10 @@ void gfx_init_image(Gfx_Image *image, void *initial_data) {
data_desc.SysMemPitch = image->width * image->channels; data_desc.SysMemPitch = image->width * image->channels;
ID3D11Texture2D* texture = 0; ID3D11Texture2D* texture = 0;
HRESULT hr = VTABLE(CreateTexture2D, d3d11_device, &desc, &data_desc, &texture); HRESULT hr = ID3D11Device_CreateTexture2D(d3d11_device, &desc, &data_desc, &texture);
win32_check_hr(hr); win32_check_hr(hr);
hr = VTABLE(CreateShaderResourceView, d3d11_device, (ID3D11Resource*)texture, 0, &image->gfx_handle); hr = ID3D11Device_CreateShaderResourceView(d3d11_device, (ID3D11Resource*)texture, 0, &image->gfx_handle);
win32_check_hr(hr); win32_check_hr(hr);
if (!initial_data) { if (!initial_data) {
@ -845,14 +908,14 @@ void gfx_set_image_data(Gfx_Image *image, u32 x, u32 y, u32 w, u32 h, void *data
ID3D11ShaderResourceView *view = image->gfx_handle; ID3D11ShaderResourceView *view = image->gfx_handle;
ID3D11Resource *resource = NULL; ID3D11Resource *resource = NULL;
VTABLE(GetResource, view, &resource); ID3D11ShaderResourceView_GetResource(view, &resource);
assert(resource, "Invalid image passed to gfx_set_image_data"); assert(resource, "Invalid image passed to gfx_set_image_data");
assert(x+w <= image->width && y+h <= image->height, "Specified subregion in image is out of bounds"); assert(x+w <= image->width && y+h <= image->height, "Specified subregion in image is out of bounds");
ID3D11Texture2D *texture = NULL; ID3D11Texture2D *texture = NULL;
HRESULT hr = VTABLE(QueryInterface, resource, &IID_ID3D11Texture2D, (void**)&texture); HRESULT hr = ID3D11Resource_QueryInterface(resource, &IID_ID3D11Texture2D, (void**)&texture);
assert(SUCCEEDED(hr), "Expected gfx resource to be a texture but it wasn't"); assert(SUCCEEDED(hr), "Expected gfx resource to be a texture but it wasn't");
D3D11_BOX destBox; D3D11_BOX destBox;
@ -864,15 +927,15 @@ void gfx_set_image_data(Gfx_Image *image, u32 x, u32 y, u32 w, u32 h, void *data
destBox.back = 1; destBox.back = 1;
// #Incomplete bit-width 8 assumed // #Incomplete bit-width 8 assumed
VTABLE(UpdateSubresource, d3d11_context, (ID3D11Resource*)texture, 0, &destBox, data, w * image->channels, 0); ID3D11DeviceContext_UpdateSubresource(d3d11_context, (ID3D11Resource*)texture, 0, &destBox, data, w * image->channels, 0);
} }
void gfx_deinit_image(Gfx_Image *image) { void gfx_deinit_image(Gfx_Image *image) {
ID3D11ShaderResourceView *view = image->gfx_handle; ID3D11ShaderResourceView *view = image->gfx_handle;
ID3D11Resource *resource = 0; ID3D11Resource *resource = 0;
VTABLE(GetResource, view, &resource); ID3D11ShaderResourceView_GetResource(view, &resource);
ID3D11Texture2D *texture = 0; ID3D11Texture2D *texture = 0;
HRESULT hr = VTABLE(QueryInterface, resource, &IID_ID3D11Texture2D, (void**)&texture); HRESULT hr = ID3D11Resource_QueryInterface(resource, &IID_ID3D11Texture2D, (void**)&texture);
if (SUCCEEDED(hr)) { if (SUCCEEDED(hr)) {
D3D11Release(view); D3D11Release(view);
D3D11Release(texture); D3D11Release(texture);
@ -881,3 +944,279 @@ void gfx_deinit_image(Gfx_Image *image) {
panic("Unhandled D3D11 resource deletion"); panic("Unhandled D3D11 resource deletion");
} }
} }
bool
shader_recompile_with_extension(string ext_source, u64 cbuffer_size) {
string source = string_replace_all(STR(d3d11_image_shader_source), STR("$INJECT_PIXEL_POST_PROCESS"), ext_source, temp);
if (!d3d11_compile_shader(source)) return false;
u64 aligned_cbuffer_size = (max(cbuffer_size, 16) + 16) & ~(15);
if (d3d11_cbuffer) {
D3D11Release(d3d11_cbuffer);
}
D3D11_BUFFER_DESC desc = ZERO(D3D11_BUFFER_DESC);
desc.ByteWidth = aligned_cbuffer_size;
desc.Usage = D3D11_USAGE_DYNAMIC;
desc.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
HRESULT hr = ID3D11Device_CreateBuffer(d3d11_device, &desc, null, &d3d11_cbuffer);
win32_check_hr(hr);
d3d11_cbuffer_size = cbuffer_size;
return true;
}
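Direct3D 11 requires the `ByteWidth` of a constant buffer to be a multiple of 16. The sketch below shows the usual power-of-two round-up for that size. It is an illustration only: `align_up` and `cbuffer_byte_width` are hypothetical names, not engine functions. Adding `alignment - 1` before masking leaves sizes that are already aligned unchanged.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not part of the engine): round `size` up to the next
// multiple of `alignment`. `alignment` must be a non-zero power of two.
static uint64_t align_up(uint64_t size, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: byte width for a constant buffer, clamped to at least
// 16 bytes and rounded up to a multiple of 16.
static uint64_t cbuffer_byte_width(uint64_t requested) {
    uint64_t minimum = requested < 16 ? 16 : requested;
    return align_up(minimum, 16);
}
```

Under these assumptions a 16-byte request stays at 16 bytes, and a 17-byte request becomes 32.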
const char *d3d11_image_shader_source = RAW_STRING(
struct VS_INPUT
{
float4 position : POSITION;
float2 uv : TEXCOORD;
float2 self_uv : SELF_UV;
float4 color : COLOR;
int texture_index : TEXTURE_INDEX;
uint type : TYPE;
uint sampler_index : SAMPLER_INDEX;
uint has_scissor : HAS_SCISSOR;
float4 userdata[$VERTEX_2D_USER_DATA_COUNT] : USERDATA;
float4 scissor : SCISSOR;
};
struct PS_INPUT
{
float4 position_screen : SV_POSITION;
float4 position : POSITION;
float2 uv : TEXCOORD0;
float2 self_uv : SELF_UV;
float4 color : COLOR;
int texture_index: TEXTURE_INDEX;
int type: TYPE;
int sampler_index: SAMPLER_INDEX;
uint has_scissor : HAS_SCISSOR;
float4 userdata[$VERTEX_2D_USER_DATA_COUNT] : USERDATA;
float4 scissor : SCISSOR;
};
PS_INPUT vs_main(VS_INPUT input)
{
PS_INPUT output;
output.position_screen = input.position;
output.position = input.position;
output.uv = input.uv;
output.color = input.color;
output.texture_index = input.texture_index;
output.type = input.type;
output.sampler_index = input.sampler_index;
output.self_uv = input.self_uv;
for (int i = 0; i < $VERTEX_2D_USER_DATA_COUNT; i++) {
output.userdata[i] = input.userdata[i];
}
output.scissor = input.scissor;
output.has_scissor = input.has_scissor;
return output;
}
// #Magicvalue
Texture2D textures[32] : register(t0);
SamplerState image_sampler_0 : register(s0);
SamplerState image_sampler_1 : register(s1);
SamplerState image_sampler_2 : register(s2);
SamplerState image_sampler_3 : register(s3);
float4 sample_texture(int texture_index, int sampler_index, float2 uv) {
// I love hlsl
if (sampler_index == 0) {
if (texture_index == 0) return textures[0].Sample(image_sampler_0, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_0, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_0, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_0, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_0, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_0, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_0, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_0, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_0, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_0, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_0, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_0, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_0, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_0, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_0, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_0, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_0, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_0, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_0, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_0, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_0, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_0, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_0, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_0, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_0, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_0, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_0, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_0, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_0, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_0, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_0, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_0, uv);
} else if (sampler_index == 1) {
if (texture_index == 0) return textures[0].Sample(image_sampler_1, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_1, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_1, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_1, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_1, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_1, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_1, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_1, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_1, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_1, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_1, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_1, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_1, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_1, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_1, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_1, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_1, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_1, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_1, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_1, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_1, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_1, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_1, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_1, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_1, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_1, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_1, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_1, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_1, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_1, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_1, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_1, uv);
} else if (sampler_index == 2) {
if (texture_index == 0) return textures[0].Sample(image_sampler_2, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_2, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_2, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_2, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_2, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_2, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_2, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_2, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_2, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_2, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_2, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_2, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_2, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_2, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_2, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_2, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_2, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_2, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_2, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_2, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_2, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_2, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_2, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_2, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_2, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_2, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_2, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_2, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_2, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_2, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_2, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_2, uv);
} else if (sampler_index == 3) {
if (texture_index == 0) return textures[0].Sample(image_sampler_3, uv);
else if (texture_index == 1) return textures[1].Sample(image_sampler_3, uv);
else if (texture_index == 2) return textures[2].Sample(image_sampler_3, uv);
else if (texture_index == 3) return textures[3].Sample(image_sampler_3, uv);
else if (texture_index == 4) return textures[4].Sample(image_sampler_3, uv);
else if (texture_index == 5) return textures[5].Sample(image_sampler_3, uv);
else if (texture_index == 6) return textures[6].Sample(image_sampler_3, uv);
else if (texture_index == 7) return textures[7].Sample(image_sampler_3, uv);
else if (texture_index == 8) return textures[8].Sample(image_sampler_3, uv);
else if (texture_index == 9) return textures[9].Sample(image_sampler_3, uv);
else if (texture_index == 10) return textures[10].Sample(image_sampler_3, uv);
else if (texture_index == 11) return textures[11].Sample(image_sampler_3, uv);
else if (texture_index == 12) return textures[12].Sample(image_sampler_3, uv);
else if (texture_index == 13) return textures[13].Sample(image_sampler_3, uv);
else if (texture_index == 14) return textures[14].Sample(image_sampler_3, uv);
else if (texture_index == 15) return textures[15].Sample(image_sampler_3, uv);
else if (texture_index == 16) return textures[16].Sample(image_sampler_3, uv);
else if (texture_index == 17) return textures[17].Sample(image_sampler_3, uv);
else if (texture_index == 18) return textures[18].Sample(image_sampler_3, uv);
else if (texture_index == 19) return textures[19].Sample(image_sampler_3, uv);
else if (texture_index == 20) return textures[20].Sample(image_sampler_3, uv);
else if (texture_index == 21) return textures[21].Sample(image_sampler_3, uv);
else if (texture_index == 22) return textures[22].Sample(image_sampler_3, uv);
else if (texture_index == 23) return textures[23].Sample(image_sampler_3, uv);
else if (texture_index == 24) return textures[24].Sample(image_sampler_3, uv);
else if (texture_index == 25) return textures[25].Sample(image_sampler_3, uv);
else if (texture_index == 26) return textures[26].Sample(image_sampler_3, uv);
else if (texture_index == 27) return textures[27].Sample(image_sampler_3, uv);
else if (texture_index == 28) return textures[28].Sample(image_sampler_3, uv);
else if (texture_index == 29) return textures[29].Sample(image_sampler_3, uv);
else if (texture_index == 30) return textures[30].Sample(image_sampler_3, uv);
else if (texture_index == 31) return textures[31].Sample(image_sampler_3, uv);
}
return float4(1.0, 0.0, 0.0, 1.0);
}
\n
$INJECT_PIXEL_POST_PROCESS
\n
\043define QUAD_TYPE_REGULAR 0\n
\043define QUAD_TYPE_TEXT 1\n
\043define QUAD_TYPE_CIRCLE 2\n
float4 ps_main(PS_INPUT input) : SV_TARGET
{
if (input.has_scissor) {
float2 screen_pos = input.position_screen.xy;
if (screen_pos.x < input.scissor.x || screen_pos.x >= input.scissor.z ||
screen_pos.y < input.scissor.y || screen_pos.y >= input.scissor.w)
discard;
}
if (input.type == QUAD_TYPE_REGULAR) {
if (input.texture_index >= 0 && input.texture_index < 32 && input.sampler_index >= 0 && input.sampler_index <= 3) {
return pixel_shader_extension(input, sample_texture(input.texture_index, input.sampler_index, input.uv)*input.color);
} else {
return pixel_shader_extension(input, input.color);
}
} else if (input.type == QUAD_TYPE_TEXT) {
if (input.texture_index >= 0 && input.texture_index < 32 && input.sampler_index >= 0 && input.sampler_index <= 3) {
float alpha = sample_texture(input.texture_index, input.sampler_index, input.uv).x;
return pixel_shader_extension(input, float4(1.0, 1.0, 1.0, alpha)*input.color);
} else {
return pixel_shader_extension(input, input.color);
}
} else if (input.type == QUAD_TYPE_CIRCLE) {
float dist = length(input.self_uv-float2(0.5, 0.5));
if (dist > 0.5) return float4(0.0, 0.0, 0.0, 0.0);
if (input.texture_index >= 0 && input.texture_index < 32 && input.sampler_index >= 0 && input.sampler_index <= 3) {
return pixel_shader_extension(input, sample_texture(input.texture_index, input.sampler_index, input.uv)*input.color);
} else {
return pixel_shader_extension(input, input.color);
}
}
return float4(1.0, 1.0, 0.0, 1.0);
}
);
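The scissor handling in this change has two halves: the CPU side swaps `y1`/`y2` and subtracts them from the window height, and the pixel shader discards anything outside the half-open rectangle. A minimal C sketch of the same arithmetic follows. `Rect`, `rect_make`, `flip_scissor_y` and `scissor_contains` are hypothetical names, and the sketch assumes the incoming rectangle uses a bottom-left origin.

```c
#include <assert.h>

// Hypothetical rectangle type: (x1, y1) is the minimum corner, (x2, y2) the maximum.
typedef struct Rect { float x1, y1, x2, y2; } Rect;

static Rect rect_make(float x1, float y1, float x2, float y2) {
    Rect r = { x1, y1, x2, y2 };
    return r;
}

// Convert a bottom-up rectangle to top-down pixel coordinates.
// Swapping y1/y2 and subtracting from the height keeps y1 <= y2.
static Rect flip_scissor_y(Rect r, float window_height) {
    assert(r.y1 <= r.y2);
    Rect out = r;
    out.y1 = window_height - r.y2;
    out.y2 = window_height - r.y1;
    return out;
}

// Mirrors the shader test: a pixel is kept when it lies inside the
// half-open range [x1, x2) x [y1, y2), and discarded otherwise.
static int scissor_contains(Rect r, float px, float py) {
    return px >= r.x1 && px < r.x2 && py >= r.y1 && py < r.y2;
}
```

The maximum edges are exclusive, so a pixel exactly at `x2` or `y2` is discarded.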


@ -15,9 +15,16 @@
     #error "Unknown renderer GFX_RENDERER defined"
 #endif
+#ifndef VERTEX_2D_USER_DATA_COUNT
+    #define VERTEX_2D_USER_DATA_COUNT 1
+#endif
 forward_global const Gfx_Handle GFX_INVALID_HANDLE;
+// #Volatile reflected in 2D batch shader
 #define QUAD_TYPE_REGULAR 0
 #define QUAD_TYPE_TEXT 1
+#define QUAD_TYPE_CIRCLE 2
 typedef enum Gfx_Filter_Mode {
     GFX_FILTER_MODE_NEAREST,
@ -30,17 +37,27 @@ typedef struct Gfx_Image {
     Allocator allocator;
 } Gfx_Image;
-Gfx_Image *make_image(u32 width, u32 height, u32 channels, void *initial_data, Allocator allocator);
-Gfx_Image *load_image_from_disk(string path, Allocator allocator);
-void delete_image(Gfx_Image *image);
+Gfx_Image *
+make_image(u32 width, u32 height, u32 channels, void *initial_data, Allocator allocator);
+Gfx_Image *
+load_image_from_disk(string path, Allocator allocator);
+void
+delete_image(Gfx_Image *image);
 // Implemented per renderer
-void gfx_init_image(Gfx_Image *image, void *data);
-void gfx_set_image_data(Gfx_Image *image, u32 x, u32 y, u32 w, u32 h, void *data);
-void gfx_deinit_image(Gfx_Image *image);
+void
+gfx_init_image(Gfx_Image *image, void *data);
+void
+gfx_set_image_data(Gfx_Image *image, u32 x, u32 y, u32 w, u32 h, void *data);
+void
+gfx_deinit_image(Gfx_Image *image);
+bool
+shader_recompile_with_extension(string ext_source, u64 cbuffer_size);
 // initial_data can be null to leave image data uninitialized
-Gfx_Image *make_image(u32 width, u32 height, u32 channels, void *initial_data, Allocator allocator) {
+Gfx_Image *
+make_image(u32 width, u32 height, u32 channels, void *initial_data, Allocator allocator) {
     Gfx_Image *image = alloc(allocator, sizeof(Gfx_Image) + width*height*channels);
     assert(channels > 0 && channels <= 4, "Only 1, 2, 3 or 4 channels allowed on images. Got %d", channels);
@ -56,7 +73,8 @@ Gfx_Image *make_image(u32 width, u32 height, u32 channels, void *initial_data, A
     return image;
 }
-Gfx_Image *load_image_from_disk(string path, Allocator allocator) {
+Gfx_Image *
+load_image_from_disk(string path, Allocator allocator) {
     string png;
     bool ok = os_read_entire_file(path, &png, allocator);
     if (!ok) return 0;
@ -92,7 +110,8 @@ Gfx_Image *load_image_from_disk(string path, Allocator allocator) {
     return image;
 }
-void delete_image(Gfx_Image *image) {
+void
+delete_image(Gfx_Image *image) {
     // Free the image data allocated by stb_image
     image->width = 0;
     image->height = 0;


@ -127,22 +127,81 @@ inline Vector4 v4_divf(LMATH_ALIGN Vector4 a, float32 s) {
     return v4_div(a, v4(s, s, s, s));
 }
+// #Simd
+inline float32 v2_length(LMATH_ALIGN Vector2 a) {
+    return sqrt(a.x*a.x + a.y*a.y);
+}
 inline Vector2 v2_normalize(LMATH_ALIGN Vector2 a) {
-    float32 length = sqrt(a.x * a.x + a.y * a.y);
+    float32 length = v2_length(a);
     if (length == 0) {
         return (Vector2){0, 0};
     }
     return v2_divf(a, length);
 }
-inline float v2_dot_product(LMATH_ALIGN Vector2 a, LMATH_ALIGN Vector2 b) {
+inline float32 v2_average(LMATH_ALIGN Vector2 a) {
+    return (a.x+a.y)/2.0;
+}
+inline Vector2 v2_abs(LMATH_ALIGN Vector2 a) {
+    return v2(fabsf(a.x), fabsf(a.y));
+}
+inline float32 v2_cross(LMATH_ALIGN Vector2 a, LMATH_ALIGN Vector2 b) {
+    return (a.x * b.y) - (a.y * b.x);
+}
+inline float v2_dot(LMATH_ALIGN Vector2 a, LMATH_ALIGN Vector2 b) {
     return simd_dot_product_float32_64((float*)&a, (float*)&b);
 }
-inline float v3_dot_product(LMATH_ALIGN Vector3 a, LMATH_ALIGN Vector3 b) {
+inline float32 v3_length(LMATH_ALIGN Vector3 a) {
+    return sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
+}
+inline Vector3 v3_normalize(LMATH_ALIGN Vector3 a) {
+    float32 length = v3_length(a);
+    if (length == 0) {
+        return (Vector3){0, 0, 0};
+    }
+    return v3_divf(a, length);
+}
+inline float32 v3_average(LMATH_ALIGN Vector3 a) {
+    return (a.x + a.y + a.z) / 3.0;
+}
+inline Vector3 v3_abs(LMATH_ALIGN Vector3 a) {
+    return v3(fabsf(a.x), fabsf(a.y), fabsf(a.z));
+}
+inline Vector3 v3_cross(LMATH_ALIGN Vector3 a, LMATH_ALIGN Vector3 b) {
+    return (Vector3){
+        (a.y * b.z) - (a.z * b.y),
+        (a.z * b.x) - (a.x * b.z),
+        (a.x * b.y) - (a.y * b.x)
+    };
+}
+inline float v3_dot(LMATH_ALIGN Vector3 a, LMATH_ALIGN Vector3 b) {
     return simd_dot_product_float32_96((float*)&a, (float*)&b);
 }
-inline float v4_dot_product(LMATH_ALIGN Vector4 a, LMATH_ALIGN Vector4 b) {
+inline float32 v4_length(LMATH_ALIGN Vector4 a) {
+    return sqrt(a.x * a.x + a.y * a.y + a.z * a.z + a.w * a.w);
+}
+inline Vector4 v4_normalize(LMATH_ALIGN Vector4 a) {
+    float32 length = v4_length(a);
+    if (length == 0) {
+        return (Vector4){0, 0, 0, 0};
+    }
+    return v4_divf(a, length);
+}
+inline float32 v4_average(LMATH_ALIGN Vector4 a) {
+    return (a.x + a.y + a.z + a.w) / 4.0;
+}
+inline Vector4 v4_abs(LMATH_ALIGN Vector4 a) {
+    return v4(fabsf(a.x), fabsf(a.y), fabsf(a.z), fabsf(a.w));
+}
+inline float v4_dot(LMATH_ALIGN Vector4 a, LMATH_ALIGN Vector4 b) {
     return simd_dot_product_float32_128_aligned((float*)&a, (float*)&b);
 }
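The cross and dot products added here satisfy a simple identity that is handy as a sanity check: a 3D cross product is perpendicular to both inputs, so its dot product with either one is zero. A standalone sketch follows, using plain scalar code instead of the engine's SIMD dot helpers. `Vec3`, `vec3_make`, `vec3_dot`, `vec3_cross` and `vec2_cross` are hypothetical stand-ins for the engine's types and functions.

```c
// Hypothetical stand-in for the engine's Vector3.
typedef struct Vec3 { float x, y, z; } Vec3;

static Vec3 vec3_make(float x, float y, float z) {
    Vec3 v = { x, y, z };
    return v;
}

// Scalar dot product (the engine routes this through SIMD helpers instead).
static float vec3_dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Same component formula as v3_cross in the diff.
static Vec3 vec3_cross(Vec3 a, Vec3 b) {
    return vec3_make(
        (a.y * b.z) - (a.z * b.y),
        (a.z * b.x) - (a.x * b.z),
        (a.x * b.y) - (a.y * b.x)
    );
}

// 2D "cross product": the z component of the 3D cross of (a, 0) and (b, 0).
// Positive when b is counter-clockwise from a.
static float vec2_cross(float ax, float ay, float bx, float by) {
    return (ax * by) - (ay * bx);
}
```

For the unit axes, `vec3_cross(x, y)` gives `z`, and swapping the arguments of `vec2_cross` flips the sign.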


@ -71,6 +71,10 @@ typedef struct Heap_Block {
     void* start;
     Heap_Block *next;
     // 32 bytes !!
+#if CONFIGURATION == DEBUG
+    u64 total_allocated;
+    u64 padding;
+#endif
 } Heap_Block;
 #define HEAP_META_SIGNATURE 6969694206942069ull
#define HEAP_META_SIGNATURE 6969694206942069ull #define HEAP_META_SIGNATURE 6969694206942069ull
@ -117,6 +121,7 @@ void sanity_check_block(Heap_Block *block) {
     assert(is_pointer_in_program_memory(block->start), "Heap_Block pointer is corrupt");
     if(block->next) { assert(is_pointer_in_program_memory(block->next), "Heap_Block next pointer is corrupt"); }
     assert(block->size < GB(256), "A heap block is corrupt.");
+    assert(block->size >= INITIAL_PROGRAM_MEMORY_SIZE, "A heap block is corrupt.");
     assert((u64)block->start == (u64)block + sizeof(Heap_Block), "A heap block is corrupt.");
@ -139,6 +144,8 @@ void sanity_check_block(Heap_Block *block) {
         node = node->next;
     }
+    u64 expected_size = get_heap_block_size_excluding_metadata(block);
+    assert(block->total_allocated+total_free == expected_size, "Heap is corrupt.");
 }
 inline void check_meta(Heap_Allocation_Metadata *meta) {
 #if CONFIGURATION == DEBUG
@ -215,10 +222,10 @@ Heap_Block *make_heap_block(Heap_Block *parent, u64 size) {
     } else {
         block = (Heap_Block*)program_memory;
     }
+#if CONFIGURATION == DEBUG
+    // total_allocated only exists in DEBUG builds (see Heap_Block)
+    block->total_allocated = 0;
+#endif
+    // #Speed #Cleanup
     if (((u8*)block)+size >= ((u8*)program_memory)+program_memory_size) {
         u64 minimum_size = ((u8*)block+size) - (u8*)program_memory + 1;
         u64 new_program_size = get_next_power_of_two(minimum_size);
@ -270,6 +277,18 @@ void *heap_alloc(u64 size) {
     assert(size < MAX_HEAP_BLOCK_SIZE, "Past Charlie has been lazy and did not handle large allocations like this. I apologize on behalf of past Charlie. A quick fix could be to increase the heap block size for now. #Incomplete #Limitation");
+#if VERY_DEBUG
+    {
+        Heap_Block *block = heap_head;
+        while (block != 0) {
+            sanity_check_block(block);
+            block = block->next;
+        }
+    }
+#endif
     Heap_Block *block = heap_head;
     Heap_Block *last_block = 0;
     Heap_Free_Node *best_fit = 0;
@ -280,10 +299,6 @@ void *heap_alloc(u64 size) {
     // Maybe instead of going through EVERY free node to find best fit we do a good-enough fit
     while (block != 0) {
-#if VERY_DEBUG
-        sanity_check_block(block);
-#endif
         if (get_heap_block_size_excluding_metadata(block) < size) {
             last_block = block;
             block = block->next;
@ -355,6 +370,7 @@ void *heap_alloc(u64 size) {
     meta->block = best_fit_block;
 #if CONFIGURATION == DEBUG
     meta->signature = HEAP_META_SIGNATURE;
+    meta->block->total_allocated += size;
 #endif
     check_meta(meta);
@ -450,6 +466,10 @@ void heap_dealloc(void *p) {
     }
+#if CONFIGURATION == DEBUG
+    block->total_allocated -= size;
+#endif
 #if VERY_DEBUG
     sanity_check_block(block);
 #endif
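The debug counter added in this file supports one invariant: bytes handed out plus bytes still free should equal the block's usable size. The toy model below illustrates that bookkeeping only. `Toy_Block` and its functions are hypothetical, and it ignores the per-allocation metadata and free-list structure the real allocator has to account for.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical, heavily simplified block: only the byte counters, no memory.
typedef struct Toy_Block {
    uint64_t capacity;        // usable bytes in the block
    uint64_t total_allocated; // bytes currently handed out
    uint64_t total_free;      // bytes currently available
} Toy_Block;

static Toy_Block toy_block_make(uint64_t capacity) {
    Toy_Block b = { capacity, 0, capacity };
    return b;
}

// Returns 1 on success, 0 if the block cannot satisfy the request.
static int toy_alloc(Toy_Block *b, uint64_t size) {
    if (size > b->total_free) return 0;
    b->total_free -= size;
    b->total_allocated += size;
    return 1;
}

// Gives `size` bytes back; freeing more than was allocated is a bug.
static void toy_dealloc(Toy_Block *b, uint64_t size) {
    assert(size <= b->total_allocated);
    b->total_allocated -= size;
    b->total_free += size;
}

// The invariant a sanity check can assert after every operation.
static int toy_block_is_consistent(const Toy_Block *b) {
    return b->total_allocated + b->total_free == b->capacity;
}
```

Because both counters move by the same amount in opposite directions, any code path that updates only one of them shows up as a failed consistency check.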


@ -107,7 +107,7 @@
 #define OGB_VERSION_MAJOR 0
 #define OGB_VERSION_MINOR 1
-#define OGB_VERSION_PATCH 0
+#define OGB_VERSION_PATCH 1
 #define OGB_VERSION (OGB_VERSION_MAJOR*1000000+OGB_VERSION_MINOR*1000+OGB_VERSION_PATCH)
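`OGB_VERSION` packs the three components into one decimal integer, so versions compare with plain integer comparison. Below is a small sketch of packing and unpacking, assuming minor and patch always stay below 1000. The function names are hypothetical and not part of the engine.

```c
#include <assert.h>
#include <stdint.h>

// Same layout as OGB_VERSION: major*1000000 + minor*1000 + patch.
// Assumes minor and patch are below 1000 so the fields do not overlap.
static uint32_t version_pack(uint32_t major, uint32_t minor, uint32_t patch) {
    assert(minor < 1000 && patch < 1000);
    return major * 1000000u + minor * 1000u + patch;
}

// Recover the individual components from a packed value.
static uint32_t version_major(uint32_t packed) { return packed / 1000000u; }
static uint32_t version_minor(uint32_t packed) { return (packed / 1000u) % 1000u; }
static uint32_t version_patch(uint32_t packed) { return packed % 1000u; }
```

With this layout, major 0, minor 1, patch 1 packs to 1001, and a check such as `OGB_VERSION >= 1001` reads as "at least 0.1.1".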


@ -255,22 +255,17 @@ void os_init(u64 program_memory_size) {
     assert(os.crt != 0, "Could not load win32 crt library. Might be compiled with non-msvc? #Incomplete #Portability");
     os.crt_vsnprintf = (Crt_Vsnprintf_Proc)os_dynamic_library_load_symbol(os.crt, STR("vsnprintf"));
     assert(os.crt_vsnprintf, "Missing vsnprintf in crt");
-    os.crt_vprintf = (Crt_Vprintf_Proc)os_dynamic_library_load_symbol(os.crt, STR("vprintf"));
-    assert(os.crt_vprintf, "Missing vprintf in crt");
-    os.crt_vsprintf = (Crt_Vsprintf_Proc)os_dynamic_library_load_symbol(os.crt, STR("vsprintf"));
-    assert(os.crt_vsprintf, "Missing vsprintf in crt");
-    os.crt_memcpy = (Crt_Memcpy_Proc)os_dynamic_library_load_symbol(os.crt, STR("memcpy"));
-    assert(os.crt_memcpy, "Missing memcpy in crt");
-    os.crt_memcmp = (Crt_Memcmp_Proc)os_dynamic_library_load_symbol(os.crt, STR("memcmp"));
-    assert(os.crt_memcmp, "Missing crt_memcmp in crt");
-    os.crt_memset = (Crt_Memset_Proc)os_dynamic_library_load_symbol(os.crt, STR("memset"));
-    assert(os.crt_memset, "Missing memset in crt");
     win32_init_window();
-    os_start_thread(os_make_thread(win32_audio_thread, get_heap_allocator()));
-    os_start_thread(os_make_thread(win32_audio_poll_default_device_thread, get_heap_allocator()));
+    local_persist Thread audio_thread, audio_poll_default_device_thread;
+    os_thread_init(&audio_thread, win32_audio_thread);
+    os_thread_init(&audio_poll_default_device_thread, win32_audio_poll_default_device_thread);
+    os_thread_start(&audio_thread);
+    os_thread_start(&audio_poll_default_device_thread);
     while (!win32_has_audio_thread_started) { os_yield_thread(); }
 }
} }
@ -341,6 +336,9 @@ bool os_grow_program_memory(u64 new_size) {
memset(program_memory, 0xBA, program_memory_size); memset(program_memory, 0xBA, program_memory_size);
} else { } else {
// #Cleanup this mess
// Allocation size doesn't actually need to be aligned to granularity, page size is enough.
// Doesn't matter that much tho, but this is just a bit unfortunate to look at.
void* tail = (u8*)program_memory + program_memory_size; void* tail = (u8*)program_memory + program_memory_size;
u64 m = ((u64)program_memory_size % os.granularity); u64 m = ((u64)program_memory_size % os.granularity);
assert(m == 0, "program_memory_size is not aligned to granularity!"); assert(m == 0, "program_memory_size is not aligned to granularity!");
@ -404,6 +402,8 @@ DWORD WINAPI win32_thread_invoker(LPVOID param) {
return 0; return 0;
} }
////// DEPRECATED vvvvvvvvvvvvvvvvv
Thread* os_make_thread(Thread_Proc proc, Allocator allocator) { Thread* os_make_thread(Thread_Proc proc, Allocator allocator) {
Thread *t = (Thread*)alloc(allocator, sizeof(Thread)); Thread *t = (Thread*)alloc(allocator, sizeof(Thread));
t->id = 0; // This is set when we start it t->id = 0; // This is set when we start it
@ -433,6 +433,33 @@ void os_start_thread(Thread *t) {
void os_join_thread(Thread *t) { void os_join_thread(Thread *t) {
WaitForSingleObject(t->os_handle, INFINITE); WaitForSingleObject(t->os_handle, INFINITE);
} }
////// DEPRECATED ^^^^^^^^^^^^^^^^
void os_thread_init(Thread *t, Thread_Proc proc) {
memset(t, 0, sizeof(Thread));
t->id = 0;
t->proc = proc;
t->initial_context = context;
}
void os_thread_destroy(Thread *t) {
os_thread_join(t);
CloseHandle(t->os_handle);
}
void os_thread_start(Thread *t) {
t->os_handle = CreateThread(
0,
0,
win32_thread_invoker,
t,
0,
(DWORD*)&t->id
);
assert(t->os_handle, "Failed creating thread");
}
void os_thread_join(Thread *t) {
WaitForSingleObject(t->os_handle, INFINITE);
}
/// ///
// Mutex primitive // Mutex primitive
@ -474,59 +501,6 @@ void os_unlock_mutex(Mutex_Handle m) {
assert(result, "Unlock mutex 0x%x failed with error %d", m, GetLastError()); assert(result, "Unlock mutex 0x%x failed with error %d", m, GetLastError());
} }
///
// Spinlock "primitive"
Spinlock *os_make_spinlock(Allocator allocator) {
// #Memory #Cleanup do we need to heap allocate this ?
Spinlock *l = cast(Spinlock*)alloc(allocator, sizeof(Spinlock));
l->locked = false;
return l;
}
void os_spinlock_lock(Spinlock *l) {
while (true) {
bool expected = false;
if (compare_and_swap_bool(&l->locked, true, expected)) {
return;
}
while (l->locked) {
// spinny boi
}
}
}
void os_spinlock_unlock(Spinlock *l) {
bool expected = true;
bool success = compare_and_swap_bool(&l->locked, false, expected);
assert(success, "This thread should have acquired the spinlock but compare_and_swap failed");
}
///
// Concurrency utilities
bool os_compare_and_swap_8(u8 *a, u8 b, u8 old) {
// #Portability not sure how portable this is.
return _InterlockedCompareExchange8((volatile CHAR*)a, (CHAR)b, (CHAR)old) == (CHAR)old;
}
bool os_compare_and_swap_16(u16 *a, u16 b, u16 old) {
return InterlockedCompareExchange16((volatile SHORT*)a, (SHORT)b, (SHORT)old) == (SHORT)old;
}
bool os_compare_and_swap_32(u32 *a, u32 b, u32 old) {
return InterlockedCompareExchange((volatile LONG*)a, (LONG)b, (LONG)old) == (LONG)old;
}
bool os_compare_and_swap_64(u64 *a, u64 b, u64 old) {
return InterlockedCompareExchange64((volatile LONG64*)a, (LONG64)b, (LONG64)old) == (LONG64)old;
}
bool os_compare_and_swap_bool(bool *a, bool b, bool old) {
return os_compare_and_swap_8(cast(u8*)a, cast(u8)b, cast(u8)old);
}
void os_sleep(u32 ms) { void os_sleep(u32 ms) {
Sleep(ms); Sleep(ms);
@ -809,6 +783,18 @@ os_file_get_size(File f) {
return result; return result;
} }
s64
os_file_get_size_from_path(string path) {
File f = os_file_open(path, O_READ);
if (f == OS_INVALID_FILE) return -1;
s64 size = os_file_get_size(f);
os_file_close(f);
return size;
}
s64 os_file_get_pos(File f) { s64 os_file_get_pos(File f) {
LARGE_INTEGER pos = {0}; LARGE_INTEGER pos = {0};
LARGE_INTEGER new_pos; LARGE_INTEGER new_pos;
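
Note on `os_file_get_size_from_path` above: it opens the file, queries the size, closes the handle, and returns -1 when the file cannot be opened. A portable sketch of the same open/measure/close shape using only standard C `stdio` rather than the engine's `File` API. The name `sketch_file_size_from_path` is hypothetical, and `ftell` returns a `long`, so very large files may not be representable on every platform.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the size of the file at 'path' in bytes, or -1 if it cannot be
   opened or measured. Stand-in for the engine function, not its implementation. */
static long long sketch_file_size_from_path(const char *path) {
    assert(path != 0);

    FILE *f = fopen(path, "rb");
    if (!f) return -1;               /* same failure convention as the diff */

    long long size = -1;
    if (fseek(f, 0, SEEK_END) == 0) {
        size = (long long)ftell(f);  /* ftell itself reports failure as -1 */
    }

    fclose(f);                       /* always release the handle before returning */
    return size;
}
```

As in the diff, the handle is closed on every path that opened it, so callers only ever deal with a number.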


@ -32,13 +32,7 @@
#define _INTSIZEOF(n) ((sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1)) #define _INTSIZEOF(n) ((sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1))
// #Cleanup we only need vsnprintf
typedef void* (__cdecl *Crt_Memcpy_Proc) (void*, const void*, size_t);
typedef int (__cdecl *Crt_Memcmp_Proc) (const void*, const void*, size_t);
typedef void* (__cdecl *Crt_Memset_Proc) (void*, int, size_t);
typedef int (__cdecl *Crt_Vprintf_Proc) (const char*, va_list);
typedef int (__cdecl *Crt_Vsnprintf_Proc) (char*, size_t, const char*, va_list); typedef int (__cdecl *Crt_Vsnprintf_Proc) (char*, size_t, const char*, va_list);
typedef int (__cdecl *Crt_Vsprintf_Proc) (char*, const char*, va_list);
typedef struct Os_Info { typedef struct Os_Info {
u64 page_size; u64 page_size;
@ -46,37 +40,19 @@ typedef struct Os_Info {
Dynamic_Library_Handle crt; Dynamic_Library_Handle crt;
// #Cleanup we only need vsnprintf
Crt_Memcpy_Proc crt_memcpy;
Crt_Memcmp_Proc crt_memcmp;
Crt_Memset_Proc crt_memset;
Crt_Vprintf_Proc crt_vprintf;
Crt_Vsnprintf_Proc crt_vsnprintf; Crt_Vsnprintf_Proc crt_vsnprintf;
Crt_Vsprintf_Proc crt_vsprintf;
void *static_memory_start, *static_memory_end; void *static_memory_start, *static_memory_end;
} Os_Info; } Os_Info;
Os_Info os; Os_Info os;
inline int crt_vprintf(const char* fmt, va_list args) {
return os.crt_vprintf(fmt, args);
}
inline bool bytes_match(void *a, void *b, u64 count) { return memcmp(a, b, count) == 0; } inline bool bytes_match(void *a, void *b, u64 count) { return memcmp(a, b, count) == 0; }
inline int vsnprintf(char* buffer, size_t n, const char* fmt, va_list args) { inline int vsnprintf(char* buffer, size_t n, const char* fmt, va_list args) {
return os.crt_vsnprintf(buffer, n, fmt, args); return os.crt_vsnprintf(buffer, n, fmt, args);
} }
inline int crt_sprintf(char *str, const char *format, ...) {
va_list args;
va_start(args, format);
int r = os.crt_vsprintf(str, format, args);
va_end(args);
return r;
}
Mutex_Handle program_memory_mutex = 0; Mutex_Handle program_memory_mutex = 0;
bool os_grow_program_memory(size_t new_size); bool os_grow_program_memory(size_t new_size);
@ -91,22 +67,25 @@ typedef struct Thread Thread;
typedef void(*Thread_Proc)(Thread*); typedef void(*Thread_Proc)(Thread*);
typedef struct Thread { typedef struct Thread {
u64 id; u64 id; // This is valid after os_thread_start
Context initial_context; Context initial_context;
void* data; void* data;
Thread_Proc proc; Thread_Proc proc;
Thread_Handle os_handle; Thread_Handle os_handle;
Allocator allocator; Allocator allocator; // Deprecated !! #Cleanup
} Thread; } Thread;
/// ///
// Thread primitive // Thread primitive
// #Cleanup this shouldn't be allocating just for the pointer!! Just do os_thread_init(*) DEPRECATED(Thread* os_make_thread(Thread_Proc proc, Allocator allocator), "Use os_thread_init instead");
Thread* os_make_thread(Thread_Proc proc, Allocator allocator); DEPRECATED(void os_destroy_thread(Thread *t), "Use os_thread_destroy instead");
void os_destroy_thread(Thread *t); DEPRECATED(void os_start_thread(Thread* t), "Use os_thread_start instead");
void os_start_thread(Thread* t); DEPRECATED(void os_join_thread(Thread* t), "Use os_thread_join instead");
void os_join_thread(Thread* t);
void os_thread_init(Thread *t, Thread_Proc proc);
void os_thread_destroy(Thread *t);
void os_thread_start(Thread *t);
void os_thread_join(Thread *t);
/// ///
@ -116,27 +95,8 @@ void os_destroy_mutex(Mutex_Handle m);
void os_lock_mutex(Mutex_Handle m); void os_lock_mutex(Mutex_Handle m);
void os_unlock_mutex(Mutex_Handle m); void os_unlock_mutex(Mutex_Handle m);
typedef struct Spinlock Spinlock;
// #Cleanup Moved to threading.c
DEPRECATED(Spinlock *os_make_spinlock(Allocator allocator), "use spinlock_init instead");
DEPRECATED(void os_spinlock_lock(Spinlock* l), "use spinlock_acquire_or_wait instead");
DEPRECATED(void os_spinlock_unlock(Spinlock* l), "use spinlock_release instead");
/// ///
// Concurrency utilities // Threading utilities
// #Cleanup
// In retrospect, I'm not sure why I choose to implement this per OS.
// I think Win32 InterlockedCompareExchange just generates the cmpxchg
// instruction anyways, so may as well just inline asm it (or Win32
// if we're compiling with msvc) (LDREX/STREX on ARM)
// - CharlieM July 8th 2024
// compare_and_swap in cpu.c
DEPRECATED(bool os_compare_and_swap_8 (u8 *a, u8 b, u8 old), "use compare_and_swap instead");
DEPRECATED(bool os_compare_and_swap_16 (u16 *a, u16 b, u16 old), "use compare_and_swap instead");
DEPRECATED(bool os_compare_and_swap_32 (u32 *a, u32 b, u32 old), "use compare_and_swap instead");
DEPRECATED(bool os_compare_and_swap_64 (u64 *a, u64 b, u64 old), "use compare_and_swap instead");
DEPRECATED(bool os_compare_and_swap_bool(bool *a, bool b, bool old), "use compare_and_swap instead");
void os_sleep(u32 ms); void os_sleep(u32 ms);
void os_yield_thread(); void os_yield_thread();
@ -147,8 +107,7 @@ void os_high_precision_sleep(f64 ms);
// Time // Time
/// ///
// #Cleanup getting the cycle count is an x86 intrinsic so this should be in cpu.c DEPRECATED(u64 os_get_current_cycle_count(), "use rdtsc() instead");
u64 os_get_current_cycle_count();
float64 os_get_current_time_in_seconds(); float64 os_get_current_time_in_seconds();
@ -196,6 +155,7 @@ bool os_file_set_pos(File f, s64 pos_in_bytes);
s64 os_file_get_pos(File f); s64 os_file_get_pos(File f);
s64 os_file_get_size(File f); s64 os_file_get_size(File f);
s64 os_file_get_size_from_path(string path);
bool os_write_entire_file_handle(File f, string data); bool os_write_entire_file_handle(File f, string data);
bool os_write_entire_file_s(string path, string data); bool os_write_entire_file_s(string path, string data);
@ -332,8 +292,8 @@ typedef struct Os_Window {
string title; string title;
union { s32 width; s32 pixel_width; }; union { s32 width; s32 pixel_width; };
union { s32 height; s32 pixel_height; }; union { s32 height; s32 pixel_height; };
s32 scaled_width; s32 scaled_width; // DPI scaled!
s32 scaled_height; s32 scaled_height; // DPI scaled!
s32 x; s32 x;
s32 y; s32 y;
Vector4 clear_color; Vector4 clear_color;
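
Note on the thread API change above: `os_make_thread` heap-allocated a `Thread` just to return a pointer, while the new `os_thread_init` / `os_thread_start` / `os_thread_join` operate on storage the caller owns. A rough sketch of that lifecycle, using POSIX threads as a stand-in for the Win32 `CreateThread` / `WaitForSingleObject` calls in the diff. All `Sketch_*` and `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

typedef struct Sketch_Thread Sketch_Thread;
typedef void (*Sketch_Thread_Proc)(Sketch_Thread *);

/* Caller-owned thread record; mirrors the data/proc/handle fields in the diff. */
struct Sketch_Thread {
    void *data;
    Sketch_Thread_Proc proc;
    pthread_t os_handle;
};

/* Trampoline with the signature the OS expects; forwards to the user proc. */
static void *sketch_thread_invoker(void *param) {
    Sketch_Thread *t = (Sketch_Thread *)param;
    t->proc(t);
    return 0;
}

/* Fills in storage provided by the caller. No allocation happens here. */
static void sketch_thread_init(Sketch_Thread *t, Sketch_Thread_Proc proc) {
    memset(t, 0, sizeof(*t));
    t->proc = proc;
}

/* Creates the OS thread. 't' must stay valid until the thread is joined. */
static void sketch_thread_start(Sketch_Thread *t) {
    int err = pthread_create(&t->os_handle, 0, sketch_thread_invoker, t);
    assert(err == 0 && "Failed creating thread");
    (void)err;
}

/* Blocks until the thread's proc has returned. */
static void sketch_thread_join(Sketch_Thread *t) {
    pthread_join(t->os_handle, 0);
}

/* Example proc: writes through the user data pointer. */
static void sketch_store_answer(Sketch_Thread *t) {
    *(int *)t->data = 42;
}
```

The trade-off is lifetime: the record has to outlive the running thread, which is presumably why `os_init` declares its two audio threads `local_persist` instead of as ordinary locals.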


@ -17,7 +17,8 @@ const string null_string = {0, 0};
#define fixed_string STR #define fixed_string STR
#define STR(s) ((string){ length_of_null_terminated_string((const char*)s), (u8*)s }) #define STR(s) ((string){ length_of_null_terminated_string((const char*)s), (u8*)s })
inline u64 length_of_null_terminated_string(const char* cstring) { inline u64
length_of_null_terminated_string(const char* cstring) {
u64 len = 0; u64 len = 0;
while (*cstring != 0) { while (*cstring != 0) {
len += 1; len += 1;
@ -26,22 +27,26 @@ inline u64 length_of_null_terminated_string(const char* cstring) {
return len; return len;
} }
string alloc_string(Allocator allocator, u64 count) { string
alloc_string(Allocator allocator, u64 count) {
string s; string s;
s.count = count; s.count = count;
s.data = cast(u8*)alloc(allocator, count); s.data = cast(u8*)alloc(allocator, count);
return s; return s;
} }
void dealloc_string(Allocator allocator, string s) { void
dealloc_string(Allocator allocator, string s) {
assert(s.count > 0 && s.data, "You tried to deallocate an empty string. That doesn't make sense."); assert(s.count > 0 && s.data, "You tried to deallocate an empty string. That doesn't make sense.");
dealloc(allocator, s.data); dealloc(allocator, s.data);
} }
string talloc_string(u64 count) { string
talloc_string(u64 count) {
string s = alloc_string(temp, count); string s = alloc_string(temp, count);
return s; return s;
} }
string string_concat(const string left, const string right, Allocator allocator) { string
string_concat(const string left, const string right, Allocator allocator) {
if (right.count + left.count == 0) return null_string; if (right.count + left.count == 0) return null_string;
if (left.count == 0) return right; if (left.count == 0) return right;
@ -54,18 +59,21 @@ string string_concat(const string left, const string right, Allocator allocator)
memcpy(result.data+left.count, right.data, right.count); memcpy(result.data+left.count, right.data, right.count);
return result; return result;
} }
char *convert_to_null_terminated_string(const string s, Allocator allocator) { char *
convert_to_null_terminated_string(const string s, Allocator allocator) {
char *cstring = cast(char*)alloc(allocator, s.count+1); char *cstring = cast(char*)alloc(allocator, s.count+1);
memcpy(cstring, s.data, s.count); memcpy(cstring, s.data, s.count);
cstring[s.count] = 0; cstring[s.count] = 0;
return cstring; return cstring;
} }
char *temp_convert_to_null_terminated_string(const string s) { char *
temp_convert_to_null_terminated_string(const string s) {
char *c = convert_to_null_terminated_string(s, temp); char *c = convert_to_null_terminated_string(s, temp);
return c; return c;
} }
bool strings_match(string a, string b) { bool
strings_match(string a, string b) {
if (a.count != b.count) return false; if (a.count != b.count) return false;
// Count match, pointer match: they are the same // Count match, pointer match: they are the same
@ -74,7 +82,8 @@ bool strings_match(string a, string b) {
return memcmp(a.data, b.data, a.count) == 0; return memcmp(a.data, b.data, a.count) == 0;
} }
string string_view(string s, u64 start_index, u64 count) { string
string_view(string s, u64 start_index, u64 count) {
assert(start_index < s.count, "array_view start_index % out of range for string count %", start_index, s.count); assert(start_index < s.count, "array_view start_index % out of range for string count %", start_index, s.count);
assert(count > 0, "array_view count must be more than 0"); assert(count > 0, "array_view count must be more than 0");
assert(start_index + count <= s.count, "array_view start_index + count is out of range"); assert(start_index + count <= s.count, "array_view start_index + count is out of range");
@ -87,7 +96,8 @@ string string_view(string s, u64 start_index, u64 count) {
} }
// Returns first index from left where "sub" matches in "s". Returns -1 if no match is found. // Returns first index from left where "sub" matches in "s". Returns -1 if no match is found.
s64 string_find_from_left(string s, string sub) { s64
string_find_from_left(string s, string sub) {
for (s64 i = 0; i <= s.count-sub.count; i++) { for (s64 i = 0; i <= s.count-sub.count; i++) {
if (strings_match(string_view(s, i, sub.count), sub)) { if (strings_match(string_view(s, i, sub.count), sub)) {
return i; return i;
@ -98,7 +108,8 @@ s64 string_find_from_left(string s, string sub) {
} }
// Returns first index from right where "sub" matches in "s". Returns -1 if no match is found. // Returns first index from right where "sub" matches in "s". Returns -1 if no match is found.
s64 string_find_from_right(string s, string sub) { s64
string_find_from_right(string s, string sub) {
for (s64 i = s.count-sub.count; i >= 0 ; i--) { for (s64 i = s.count-sub.count; i >= 0 ; i--) {
if (strings_match(string_view(s, i, sub.count), sub)) { if (strings_match(string_view(s, i, sub.count), sub)) {
return i; return i;
@ -108,10 +119,93 @@ s64 string_find_from_right(string s, string sub) {
return -1; return -1;
} }
bool string_starts_with(string s, string sub) { bool
string_starts_with(string s, string sub) {
if (s.count < sub.count) return false; if (s.count < sub.count) return false;
s.count = sub.count; s.count = sub.count;
return strings_match(s, sub); return strings_match(s, sub);
} }
string
string_copy(string s, Allocator allocator) {
string c = alloc_string(allocator, s.count);
memcpy(c.data, s.data, s.count);
return c;
}
typedef struct String_Builder {
union {
struct {u64 count;u8 *buffer;};
string result;
};
u64 buffer_capacity;
Allocator allocator;
} String_Builder;
void
string_builder_reserve(String_Builder *b, u64 required_capacity) {
if (b->buffer_capacity >= required_capacity) return;
u64 new_capacity = max(b->buffer_capacity*2, (u64)(required_capacity*1.5));
u8 *new_buffer = alloc(b->allocator, new_capacity);
if (b->buffer) {
memcpy(new_buffer, b->buffer, b->count);
dealloc(b->allocator, b->buffer);
}
b->buffer = new_buffer;
b->buffer_capacity = new_capacity;
}
void
string_builder_init_reserve(String_Builder *b, u64 reserved_capacity, Allocator allocator) {
reserved_capacity = max(reserved_capacity, 128);
b->allocator = allocator;
b->buffer_capacity = 0;
b->buffer = 0;
string_builder_reserve(b, reserved_capacity);
b->count = 0;
}
void
string_builder_init(String_Builder *b, Allocator allocator) {
string_builder_init_reserve(b, 128, allocator);
}
void
string_builder_append(String_Builder *b, string s) {
assert(b->allocator.proc, "String_Builder is missing allocator");
string_builder_reserve(b, b->count+s.count);
memcpy(b->buffer+b->count, s.data, s.count);
b->count += s.count;
}
string
string_builder_get_string(String_Builder b) {
return b.result;
}
string
string_replace_all(string s, string old, string new, Allocator allocator) {
if (!s.data || !s.count) return string_copy(null_string, allocator);
String_Builder builder;
string_builder_init_reserve(&builder, s.count, allocator);
while (s.count > 0) {
if (s.count >= old.count && strings_match(string_view(s, 0, old.count), old)) {
if (new.count != 0) string_builder_append(&builder, new);
s.data += old.count;
s.count -= old.count;
} else {
string_builder_append(&builder, string_view(s, 0, 1));
s.data += 1;
s.count -= 1;
}
}
return string_builder_get_string(builder);
}
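
Note on `string_replace_all` above: it walks the input left to right, emits `new` wherever `old` matches at the current position, and otherwise copies one byte. A self-contained sketch of the same greedy, non-overlapping replacement on a counted string. It swaps the `String_Builder` for a two-pass approach (count the matches, then allocate once with `malloc`). `Sketch_String` and the `sketch_*` functions are hypothetical stand-ins for the engine's `string` type and allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Counted, not null-terminated, like the engine's string type. */
typedef struct Sketch_String {
    uint64_t count;
    uint8_t *data;
} Sketch_String;

/* Wraps a C literal without copying it. */
static Sketch_String sketch_str(const char *cstring) {
    Sketch_String s = { strlen(cstring), (uint8_t *)cstring };
    return s;
}

static int sketch_strings_match(Sketch_String a, Sketch_String b) {
    if (a.count != b.count) return 0;
    if (a.count == 0) return 1;
    return memcmp(a.data, b.data, a.count) == 0;
}

/* Returns a malloc'd copy of 's' with every occurrence of 'old' replaced by
   'replacement'. The caller frees result.data. */
static Sketch_String sketch_replace_all(Sketch_String s, Sketch_String old,
                                        Sketch_String replacement) {
    /* An empty pattern would match everywhere and never advance. */
    assert(old.count > 0);

    /* Pass 1: count greedy left-to-right matches to size the result. */
    uint64_t matches = 0;
    for (uint64_t i = 0; i + old.count <= s.count; ) {
        if (memcmp(s.data + i, old.data, old.count) == 0) { matches++; i += old.count; }
        else i++;
    }

    Sketch_String result;
    result.count = s.count - matches * old.count + matches * replacement.count;
    result.data = (uint8_t *)malloc(result.count ? result.count : 1);
    assert(result.data != 0);

    /* Pass 2: same scan, now writing the output. */
    uint64_t in = 0, out = 0;
    while (in < s.count) {
        if (in + old.count <= s.count && memcmp(s.data + in, old.data, old.count) == 0) {
            if (replacement.count) memcpy(result.data + out, replacement.data, replacement.count);
            out += replacement.count;
            in += old.count;
        } else {
            result.data[out++] = s.data[in++];
        }
    }
    return result;
}
```

The sketch rejects an empty `old` up front. In the diffed version that case would appear to reach `string_view(s, 0, 0)`, whose `count > 0` assert fires, so the behaviour is similar but less explicit.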


@ -167,7 +167,7 @@ string tprintf(const char *fmt, ...) {
return s; return s;
} }
// print for 'string' and printf for 'char*' // prints for 'string' and printf for 'char*'
#define PRINT_BUFFER_SIZE 4096 #define PRINT_BUFFER_SIZE 4096
// Avoids any and all allocations, at the cost of some speed and memory overhead. // Avoids any and all allocations, at the cost of some speed and memory overhead.
@ -231,47 +231,8 @@ typedef void(*Logger_Proc)(Log_Level level, string s);
#define log(...) LOG_BASE(LOG_INFO, __VA_ARGS__) #define log(...) LOG_BASE(LOG_INFO, __VA_ARGS__)
typedef struct String_Builder {
union {
struct {u64 count;u8 *buffer;};
string result;
};
u64 buffer_capacity;
Allocator allocator;
} String_Builder;
void string_builder_reserve(String_Builder *b, u64 required_capacity) {
if (b->buffer_capacity >= required_capacity) return;
u64 new_capacity = max(b->buffer_capacity*2, (u64)(required_capacity*1.5));
u8 *new_buffer = alloc(b->allocator, new_capacity);
if (b->buffer) {
memcpy(new_buffer, b->buffer, b->count);
dealloc(b->allocator, b->buffer);
}
b->buffer = new_buffer;
b->buffer_capacity = new_capacity;
}
void string_builder_init_reserve(String_Builder *b, u64 reserved_capacity, Allocator allocator) {
reserved_capacity = max(reserved_capacity, 128);
b->allocator = allocator;
b->buffer_capacity = 0;
b->buffer = 0;
string_builder_reserve(b, reserved_capacity);
b->count = 0;
}
void string_builder_init(String_Builder *b, Allocator allocator) {
string_builder_init_reserve(b, 128, allocator);
}
void string_builder_append(String_Builder *b, string s) {
assert(b->allocator.proc, "String_Builder is missing allocator");
string_builder_reserve(b, b->count+s.count);
memcpy(b->buffer+b->count, s.data, s.count);
b->count += s.count;
}
void string_builder_prints(String_Builder *b, string fmt, ...) { void string_builder_prints(String_Builder *b, string fmt, ...) {
assert(b->allocator.proc, "String_Builder is missing allocator"); assert(b->allocator.proc, "String_Builder is missing allocator");
@ -315,8 +276,3 @@ void string_builder_printf(String_Builder *b, const char *fmt, ...) {
string: string_builder_prints, \ string: string_builder_prints, \
default: string_builder_printf \ default: string_builder_printf \
)(__VA_ARGS__) )(__VA_ARGS__)
string string_builder_get_string(String_Builder *b) {
return b->result;
}
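
Note on the `String_Builder` moved out of this file: `string_builder_reserve` grows the buffer to the larger of twice the current capacity and 1.5 times the required capacity, then copies the old contents across. A minimal sketch of that growth policy with `malloc`/`free` in place of the engine's `Allocator`. `Sketch_Builder` and the `sketch_*` functions are hypothetical, and the 1.5 factor is computed in integer arithmetic (`required + required/2`) instead of through a floating-point multiply.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Sketch_Builder {
    uint64_t count;     /* bytes written so far */
    uint8_t *buffer;    /* owned storage, may be null before first reserve */
    uint64_t capacity;  /* bytes available in buffer */
} Sketch_Builder;

/* Makes sure at least 'required' bytes fit, growing geometrically if not. */
static void sketch_builder_reserve(Sketch_Builder *b, uint64_t required) {
    if (b->capacity >= required) return;

    uint64_t doubled = b->capacity * 2;
    uint64_t padded = required + required / 2;  /* integer form of required*1.5 */
    uint64_t new_capacity = doubled > padded ? doubled : padded;

    uint8_t *new_buffer = (uint8_t *)malloc(new_capacity);
    assert(new_buffer != 0);
    if (b->buffer) {
        memcpy(new_buffer, b->buffer, b->count);  /* keep what was written */
        free(b->buffer);
    }
    b->buffer = new_buffer;
    b->capacity = new_capacity;
}

/* Appends 'n' bytes, reserving first so the copy cannot overflow. */
static void sketch_builder_append(Sketch_Builder *b, const void *data, uint64_t n) {
    sketch_builder_reserve(b, b->count + n);
    if (n) memcpy(b->buffer + b->count, data, n);
    b->count += n;
}
```

Under this policy, reserving the default 128 bytes on an empty builder yields a 192-byte buffer, and later growth at least doubles it, which keeps the number of reallocations logarithmic in the final length.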


@ -212,11 +212,12 @@ void test_thread_proc1(Thread* t) {
void test_threads() { void test_threads() {
Thread* t = os_make_thread(test_thread_proc1, get_heap_allocator()); Thread t;
os_start_thread(t); os_thread_init(&t, test_thread_proc1);
os_thread_start(&t);
os_sleep(20); os_sleep(20);
print("This should be printed in middle of thread execution\n"); print("This should be printed in middle of thread execution\n");
os_join_thread(t); os_thread_join(&t);
print("Thread is joined\n"); print("Thread is joined\n");
Mutex_Handle m = os_make_mutex(); Mutex_Handle m = os_make_mutex();
@ -394,7 +395,7 @@ void test_strings() {
assert(memcmp(builder.buffer, expected_result, builder.count) == 0, "Failed: string_builder_printf"); assert(memcmp(builder.buffer, expected_result, builder.count) == 0, "Failed: string_builder_printf");
// Test string_builder_get_string // Test string_builder_get_string
string result_str = string_builder_get_string(&builder); string result_str = string_builder_get_string(builder);
assert(result_str.count == builder.count, "Failed: string_builder_get_string"); assert(result_str.count == builder.count, "Failed: string_builder_get_string");
assert(memcmp(result_str.data, builder.buffer, result_str.count) == 0, "Failed: string_builder_get_string"); assert(memcmp(result_str.data, builder.buffer, result_str.count) == 0, "Failed: string_builder_get_string");
@ -404,7 +405,7 @@ void test_strings() {
// Test handling of empty builder // Test handling of empty builder
String_Builder empty_builder; String_Builder empty_builder;
string_builder_init(&empty_builder, heap); string_builder_init(&empty_builder, heap);
result_str = string_builder_get_string(&empty_builder); result_str = string_builder_get_string(empty_builder);
assert(result_str.count == 0, "Failed: empty builder handling"); assert(result_str.count == 0, "Failed: empty builder handling");
dealloc(heap, empty_builder.buffer); dealloc(heap, empty_builder.buffer);
@ -440,6 +441,15 @@ void test_strings() {
assert(multi_append_builder.count == strlen(expected_result), "Failed: multiple appends"); assert(multi_append_builder.count == strlen(expected_result), "Failed: multiple appends");
assert(memcmp(multi_append_builder.buffer, expected_result, multi_append_builder.count) == 0, "Failed: multiple appends"); assert(memcmp(multi_append_builder.buffer, expected_result, multi_append_builder.count) == 0, "Failed: multiple appends");
dealloc(heap, multi_append_builder.buffer); dealloc(heap, multi_append_builder.buffer);
string cheese_hello = STR("HeCHEESElloCHEESE, WorCHEESEld!");
string hello = string_replace_all(cheese_hello, STR("CHEESE"), STR(""), heap);
assert(strings_match(hello, STR("Hello, World!")), "Failed: string_replace");
string hello_balls = string_replace_all(hello, STR("Hello"), STR("Greetings"), heap);
hello_balls = string_replace_all(hello_balls, STR("World"), STR("Balls"), heap);
assert(strings_match(hello_balls, STR("Greetings, Balls!")), "Failed: string_replace");
} }
void test_file_io() { void test_file_io() {
@ -767,65 +777,65 @@ void test_simd() {
memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float));
memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float));
u64 start = os_get_current_cycle_count(); u64 start = rdtsc();
for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 16) { for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 16) {
simd_mul_float32_512_aligned(&samples_a[i], &samples_b[i], &samples_a[i]); simd_mul_float32_512_aligned(&samples_a[i], &samples_b[i], &samples_a[i]);
} }
u64 end = os_get_current_cycle_count(); u64 end = rdtsc();
u64 cycles = end-start; u64 cycles = end-start;
print("simd 512 float32 mul took %llu cycles\n", cycles); print("simd 512 float32 mul took %llu cycles\n", cycles);
memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float));
memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float));
start = os_get_current_cycle_count(); start = rdtsc();
for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 8) { for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 8) {
simd_mul_float32_256_aligned(&samples_a[i], &samples_b[i], &samples_a[i]); simd_mul_float32_256_aligned(&samples_a[i], &samples_b[i], &samples_a[i]);
} }
end = os_get_current_cycle_count(); end = rdtsc();
cycles = end-start; cycles = end-start;
print("simd 256 float32 mul took %llu cycles\n", cycles); print("simd 256 float32 mul took %llu cycles\n", cycles);
memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float));
memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float));
start = os_get_current_cycle_count(); start = rdtsc();
for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 4) { for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 4) {
simd_mul_float32_128_aligned(&samples_a[i], &samples_b[i], &samples_a[i]); simd_mul_float32_128_aligned(&samples_a[i], &samples_b[i], &samples_a[i]);
} }
end = os_get_current_cycle_count(); end = rdtsc();
cycles = end-start; cycles = end-start;
print("simd 128 float32 mul took %llu cycles\n", cycles); print("simd 128 float32 mul took %llu cycles\n", cycles);
memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float));
memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float));
start = os_get_current_cycle_count(); start = rdtsc();
for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 2) { for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 2) {
simd_mul_float32_64(&samples_a[i], &samples_b[i], &samples_a[i]); simd_mul_float32_64(&samples_a[i], &samples_b[i], &samples_a[i]);
} }
end = os_get_current_cycle_count(); end = rdtsc();
cycles = end-start; cycles = end-start;
print("simd 64 float32 mul took %llu cycles\n", cycles); print("simd 64 float32 mul took %llu cycles\n", cycles);
memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_a, 2, _TEST_NUM_SAMPLES*sizeof(float));
memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float)); memset(samples_b, 2, _TEST_NUM_SAMPLES*sizeof(float));
start = os_get_current_cycle_count(); start = rdtsc();
for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 1) { for (u64 i = 0; i < _TEST_NUM_SAMPLES; i += 1) {
samples_a[i] = samples_a[i] * samples_b[i]; samples_a[i] = samples_a[i] * samples_b[i];
} }
end = os_get_current_cycle_count(); end = rdtsc();
cycles = end-start; cycles = end-start;
print("NO SIMD float32 mul took %llu cycles\n", cycles); print("NO SIMD float32 mul took %llu cycles\n", cycles);
} }
@ -1011,13 +1021,13 @@ void test_linmath() {
assert(mixed_v4_result.x == 1.0f && mixed_v4_result.y == 2.0f && mixed_v4_result.z == 3.0f && mixed_v4_result.w == 4.0f, "Mixed Vector4 scalar multiplication failed"); assert(mixed_v4_result.x == 1.0f && mixed_v4_result.y == 2.0f && mixed_v4_result.z == 3.0f && mixed_v4_result.w == 4.0f, "Mixed Vector4 scalar multiplication failed");
float v2_dot = v2_dot_product(v2(2, 7), v2(3, 2)); float v2_dot_product = v2_dot(v2(2, 7), v2(3, 2));
float v3_dot = v3_dot_product(v3(2, 7, 2), v3(3, 2, 9)); float v3_dot_product = v3_dot(v3(2, 7, 2), v3(3, 2, 9));
float v4_dot = v4_dot_product(v4(2, 7, 6, 1), v4(3, 2, 1, 4)); float v4_dot_product = v4_dot(v4(2, 7, 6, 1), v4(3, 2, 1, 4));
assert(floats_roughly_match(v2_dot, 20), "Failed: v2_dot_product"); assert(floats_roughly_match(v2_dot_product, 20), "Failed: v2_dot");
assert(floats_roughly_match(v3_dot, 38), "Failed: v3_dot_product"); assert(floats_roughly_match(v3_dot_product, 38), "Failed: v3_dot");
assert(floats_roughly_match(v4_dot, 30), "Failed: v4_dot_product"); assert(floats_roughly_match(v4_dot_product, 30), "Failed: v4_dot");
} }
void test_hash_table() { void test_hash_table() {
Hash_Table table = make_hash_table(string, int, get_heap_allocator()); Hash_Table table = make_hash_table(string, int, get_heap_allocator());
@ -1061,7 +1071,7 @@ void test_hash_table() {
void test_random_distribution() { void test_random_distribution() {
int bins[NUM_BINS] = {0}; int bins[NUM_BINS] = {0};
seed_for_random = os_get_current_cycle_count(); seed_for_random = rdtsc();
for (int i = 0; i < NUM_SAMPLES; i++) { for (int i = 0; i < NUM_SAMPLES; i++) {
f32 rand_val = get_random_float32(); f32 rand_val = get_random_float32();
int bin = (int)(rand_val * NUM_BINS); int bin = (int)(rand_val * NUM_BINS);
@ -1124,16 +1134,16 @@ void test_mutex() {
const int num_threads = 100; const int num_threads = 100;
Thread **threads = alloc(allocator, sizeof(Thread*)*num_threads); Thread *threads = alloc(allocator, sizeof(Thread)*num_threads);
for (u64 i = 0; i < num_threads; i++) { for (u64 i = 0; i < num_threads; i++) {
threads[i] = os_make_thread(mutex_test_increment_counter, allocator); os_thread_init(&threads[i], mutex_test_increment_counter);
threads[i]->data = &data; threads[i].data = &data;
} }
for (u64 i = 0; i < num_threads; i++) { for (u64 i = 0; i < num_threads; i++) {
os_start_thread(threads[i]); os_thread_start(&threads[i]);
} }
for (u64 i = 0; i < num_threads; i++) { for (u64 i = 0; i < num_threads; i++) {
os_join_thread(threads[i]); os_thread_join(&threads[i]);
} }
assert(data.counter == num_threads * MUTEX_TEST_TASK_COUNT, "Failed: Counter does not match expected value after threading tasks"); assert(data.counter == num_threads * MUTEX_TEST_TASK_COUNT, "Failed: Counter does not match expected value after threading tasks");
@ -1167,9 +1177,9 @@ void test_sort() {
u64 sort_value_offset_in_item = offsetof(Draw_Quad, z); u64 sort_value_offset_in_item = offsetof(Draw_Quad, z);
float64 start_seconds = os_get_current_time_in_seconds(); float64 start_seconds = os_get_current_time_in_seconds();
u64 start_cycles = os_get_current_cycle_count(); u64 start_cycles = rdtsc();
radix_sort(items, buffer, item_count, item_size, sort_value_offset_in_item, id_bits); radix_sort(items, buffer, item_count, item_size, sort_value_offset_in_item, id_bits);
u64 end_cycles = os_get_current_cycle_count(); u64 end_cycles = rdtsc();
float64 end_seconds = os_get_current_time_in_seconds(); float64 end_seconds = os_get_current_time_in_seconds();
for (u64 i = 1; i < item_count; i++) { for (u64 i = 1; i < item_count; i++) {
@ -1195,9 +1205,9 @@ void test_sort() {
u64 sort_value_offset_in_item = offsetof(Draw_Quad, z); u64 sort_value_offset_in_item = offsetof(Draw_Quad, z);
float64 start_seconds = os_get_current_time_in_seconds(); float64 start_seconds = os_get_current_time_in_seconds();
u64 start_cycles = os_get_current_cycle_count(); u64 start_cycles = rdtsc();
merge_sort(items, buffer, item_count, item_size, compare_draw_quads); merge_sort(items, buffer, item_count, item_size, compare_draw_quads);
u64 end_cycles = os_get_current_cycle_count(); u64 end_cycles = rdtsc();
float64 end_seconds = os_get_current_time_in_seconds(); float64 end_seconds = os_get_current_time_in_seconds();
for (u64 i = 1; i < item_count; i++) { for (u64 i = 1; i < item_count; i++) {
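
Note on the `test_linmath` hunk above: the expected values 20, 38 and 30 follow directly from the component-wise definition of the dot product, for example (2, 7) · (3, 2) = 2*3 + 7*2 = 20. A small sketch that reproduces the three checks on plain float arrays. `sketch_dot` and `sketch_roughly_match` are hypothetical stand-ins for the engine's `v2_dot` / `v3_dot` / `v4_dot` and `floats_roughly_match`, and the 0.001 tolerance is an assumption.

```c
#include <assert.h>

/* Dot product of two n-component vectors: sum of pairwise products. */
static float sketch_dot(const float *a, const float *b, int n) {
    assert(n > 0);
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Approximate equality; the tolerance is an assumed value, not the engine's. */
static int sketch_roughly_match(float a, float b) {
    float d = a - b;
    if (d < 0.0f) d = -d;
    return d < 0.001f;
}
```

The three-component case works out to 2*3 + 7*2 + 2*9 = 38 and the four-component case to 2*3 + 7*2 + 6*1 + 1*4 = 30, matching the asserts in the test.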


@ -12,7 +12,7 @@
- All FILE * stdio procedures are replaced with the equivalent for the oogabooga - All FILE * stdio procedures are replaced with the equivalent for the oogabooga
file API. file API.
- all draw_line -> stb_vorbis_draw_line
*/ */
@ -2125,7 +2125,7 @@ static float inverse_db_table[256] =
int8 integer_divide_table[DIVTAB_NUMER][DIVTAB_DENOM]; // 2KB int8 integer_divide_table[DIVTAB_NUMER][DIVTAB_DENOM]; // 2KB
#endif #endif
static __forceinline void draw_line(float *output, int x0, int y0, int x1, int y1, int n) static __forceinline void stb_vorbis_draw_line(float *output, int x0, int y0, int x1, int y1, int n)
{ {
int dy = y1 - y0; int dy = y1 - y0;
int adx = x1 - x0; int adx = x1 - x0;
@ -3186,13 +3186,13 @@ static int do_floor(vorb *f, Mapping *map, int i, int n, float *target, YTYPE *f
int hy = finalY[j] * g->floor1_multiplier; int hy = finalY[j] * g->floor1_multiplier;
int hx = g->Xlist[j]; int hx = g->Xlist[j];
if (lx != hx) if (lx != hx)
draw_line(target, lx,ly, hx,hy, n2); stb_vorbis_draw_line(target, lx,ly, hx,hy, n2);
CHECK(f); CHECK(f);
lx = hx, ly = hy; lx = hx, ly = hy;
} }
} }
if (lx < n2) { if (lx < n2) {
// optimization of: draw_line(target, lx,ly, n,ly, n2); // optimization of: stb_vorbis_draw_line(target, lx,ly, n,ly, n2);
for (j=lx; j < n2; ++j) for (j=lx; j < n2; ++j)
LINE_OP(target[j], inverse_db_table[ly]); LINE_OP(target[j], inverse_db_table[ly]);
CHECK(f); CHECK(f);


@ -101,3 +101,4 @@ void merge_sort(void *collection, void *help_buffer, u64 item_count, u64 item_si
} }
} }
} }